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Pre  f ac  e 


The  purpose  of  this  research  was  to  develop  software 
which  automated  the  current  simulation  output  analysis 
techniques  of  batch  means  and  independent  replications,  and 
to  determine  the  applicability  of  employing  Kalman  filtering 
for  simulation  output  analysis  on  an  M/M/1  queue.  Both  of 
these  objectives  were  met,  with  several  areas  identified  for 
follow-on  research. 

During  the  course  of  this  thesis,  many  people  have  gone 
out  of  their  way  to  help  me.  I  would  like  to  thank  Major 
Kenneth  Bauer,  Dr.  James  Chrissis,  and,  especially.  Dr. 

Peter  Maybeck  for  being  on  my  committee.  Major  Bauer's 
insight  along  with  Dr.  Maybeck's  knowledge  of  the  Kalman 
filter,  and  patience  made  many  areas  of  investigation 
possible. 

I  would  also  like  to  thank  Mr.  Stan  Musick,  the 
designer  of  the  Multimode  Simulation  for  Optimal  Filter 
Evaluation  (MSOFE)  program,  and  Captain  Britt  Snodgrass  for 
their  help  in  employing  MSOFE.  I  would  also  like  to  take 
this  opportunity  to  express  my  thankfulness  to  Captain  Mark 
Gallagher  for  the  insights  and  many  hours  of  help  he 
provided. 

Finally,  I  wish  to  thank  my  wife,  Carla,  and  my 
daughter,  Megan,  for  their  support  throughout  this  program. 

i  i 


Table  of  Contents 


Preface .  i  i 

List  of  Figures .  v 

List  of  Tables . vii 

Abst  ract . viii 

I.  Introduction . 1 

Purpose  .  3 

Batch  Means  .............  4 

Independent  Replications  .  5 

Kalman  filtering  .  5 

Time  Series  Models  .  5 

Queueing  Simulation  Models  .  6 

Sub-objectives  .  6 

Overview  of  Chapters  .  8 

II.  Historical  Development  .  9 

Background  .  9 

Steady  State  vs.  Terminating 

Simul  at  ions .  11 

Moving  Averages  .  11 

Confidence  Intervals  .  14 

Data  Truncation .  15 

Cubic  Spline  and  Linear  Interpolation  .  .  17 

Independent  Replications  .  18 

Batch  Means .  19 

Kalman  Filtering  .  22 

Static  Case .  23 

Adaptive  Filtering  .  31 

III.  Methodology .  37 

Overview .  37 

Model  Formulation .  37 

Output  Analysis  Program  .  39 

Application  of  Kalman  Filtering  .  45 

Data  Formatting .  50 

Summary .  55 


i  i  i 


IV.  Results  and  Discussion .  57 

Output  Analysis .  57 

Batch  Means  . .  58 

Independent  Replications .  61 

Kalman  Filter  .  64 

Calculation  of  System  Variance  .  74 

V.  Conclusions  and  Recommendations  .  78 

Conclusions .  78 

Areas  for  Future  Development .  80 

Appendices .  84 

SLAM  II  and  FORTRAN  Code  for  M/M/1  Queue  .  .  .  84 

Output  Analysis  Program  User's  Guide  .  86 

Output  Analysis  Program  .  93 

Auxiliary  Turbo  Pascal  Programs  Used  .  116 

Array  of  Time-in-System  to  T i me-o f -Obser vat i on .  119 

Bibliography .  120 

Vita .  123 


List  of  F iQures 

F 1 gur e  Page 

1.  Representative  Simulation  Model  Output  Data  ...  13 

2.  Moving  Averaged  Simulation  Output  .  13 

3.  Flowchart  of  M/M/1  Queueing  Simulation  Model  .  .  39 

4.  Configuration  of  SLAM  II  Output .  40 

5.  Flowchart  of  Output  Analysis  Program  .  40 

6.  Output  Analysis  Program  Graph  of  Data  .  43 

7.  Batch  Means  Output  Analysis  Data  .  44 

8.  Independent  Replications  Output  Analysis  Data  .  .  44 

9.  Autocorrelation  Plot  of  Equally  Spaced  Output  .  .  47 

10.  Kalman  Filter  Values  from  FORECAST  MASTER  ....  48 

11.  State  Space  Modelling  Statistics  .  49 

12.  Flowchart  of  Data  Formatting .  51 

13.  Plot  of  Initial  Output  Data .  53 

14.  Partial  Cubic  Spline  Interpolation  of  Output  .  .  54 

15.  Plot  of  Linearly  Interpolated  Data  . .  54 

16.  Configuration  of  SLAM  II  Output .  59 

17.  Graph  of  Moving  Averaged  Output  Data  .  59 

18.  Graph  of  Averaged  Replications  Data .  62 

19.  Comparison  of  Estimated  and  Analytic  Means  .  .  .  64 

20.  FORECAST  MASTER  Output  on  the 

Kalman  Filter,  1  =  10 .  65 

21.  FORECAST  MASTER  Output  on  the 

Kalman  Filter,  1=4 .  67 

22.  Histogram  of  Original  Output  Distribution  ....  69 


V 


23.  Histogram  of  1=10  Interpolated  Data  Distribution.  70 

24.  Histogram  of  1=4  Interpolated  Data  Distribution  .  70 


25.  Autocorrelation  of  1  =  10  Data .  72 

26.  Autocorrelation  of  1=4  Data .  72 

27.  Error  Variance  Time  History  . .  76 


VI 


I 


List  of  Tables 


Table  Page 

1.  Programs  Used  and  Their  Purposes .  52 

2.  Batch  Means  Performance  Results  .  60 

3.  Independent  Replications  Performance  Results  .  62 

4.  Kalman  Filter  Performance  Results,  1=10  ....  66 

5.  Kalman  Filter  Performance  Results,  1=4  ....  68 

6.  Comparison  of  Mean  Values .  73 

7.  Differences  Between  Mean  Values  .  73 

8.  Coverage  Rates  with  Delta  Added  In .  74 


vi  1 


Abst  r  ac  t 

7ke'^'< 

The  purpose  of  this  r CMacor etr  was  twofold.  First,  to 
develop  software  which  automated  the  current  simulation 
output  analysis  techniques  of  batch  means  and  independent 
replications.  Second,  to  determine  the  applicability  of 
employing  Kalman  filtering  for  simulation  output  analysis  on 
an  M/M/1  queue. 

The  Output  Analysis  Program  incorporates  batch  means 
and  independent  replications  into  a  menu  driven,  micro¬ 
computer''  'program.  The  program  allows  the  user  to  quickly 
analyze  the  simulation  output  graphically,  discard  the 
transient,  and  then  perform  the  two  techniques. 

The  Kalman  Filter  proved  valuable  in  providing  insight 
into  the  underlying  system  variance  in  steady  state,  and 


appeared  to  offer  several  avenues  for  further  research, 
particularly  in  the  area  of  transient  identification. 


Vi  i  i 


COMPARISON  OF  BATCH  MEANS  AND  INDEPENDENT  REPLICATIONS 


TECHNIQUES  TO  APPLICATIONS  OF  THE  KALMAN  FILTER 
FOR  SIMULATION  OUTPUT  ANALYSIS 

I .  Introduct ion 

Aircraft  scheduling  and  the  repair  of  line  replaceable 
units  represent  Just  two  areas  from  a  vast  array  of  Air 
Force  production  processes.  One  problem  that  managers  of 
these  processes  face  is  deciding  which,  if  any,  of  the  many 
options  available,  will  increase  their  efficiency. 

Computer  simulation  is  a  technique  which  allows 
managers  to  evaluate  the  benefits  of  several  alternative 
courses  of  action  quickly.  In  a  discrete  event  simulation, 
the  system  to  be  studied  is  modelled  as  a  series  of  events 
which  occur  at  discrete  points  in  time  (22:381).  The 
performance  of  the  simulation  model  is  then  assumed 
representative  of  the  actual  system  through  its  warm  up 
period  and  into  what  is  known  as  steady  state. 

Steady  state  is  one  of  the  two  system  states  defined  by 
Pritsker  in,  Introduction  to  Simulation  and  SLAM  II.  It  has 
been  reached  when  the  system  is  oscillating  about  some  fixed 
mean  value.  The  second  state  is  known  as  transient.  It 
precedes  steady  state  and  reflects  the  system's  "warm-up" 
period  (22:44).  Data  from  this  state  is  typically  removed 
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before  most  output  analysis  techniques  are  employed. 

The  process  used  to  discard  the  transient  data  is  called,  by 
Pritsker,  "Data  Truncation"  and  the  following  quote,  from 
his  book,  highlights  some  of  its  limitations. 

The  intent  is  to  reduce  the  initial  condition  bias  in 
the  estimates  by  eliminating  values  recorded  during 
the  transient  period  of  the  simulation.  However,  by 
discarding  a  portion  of  the  data,  we  are  not  using 
observations  and,  hence,  may  be  increasing  the 
estimated  variance  of  the  mean  .  .  •  The  truncation 

point  is  selected  as  the  time  at  which  the  response 
'appears'  to  have  reached  steady  state.  (22:44) 

Problems  arise  in  deciding  the  point  at  which  the 
system  transitions  from  transient  to  near  steady  state,  the 
number  of  simulation  runs  required  to  attain  a  certain 
degree  of  accuracy  in  the  predictions,  and  the  required 
length  of  these  runs,  which  would  ensure  some  representation 
of  steady  state  had  been  achieved.  There  are  numerous 
methods  available  (22:752-757),  but  each  requires  the 
analyst  to  make  subjective  decisions.  It  is  this 
subjectivity  that  makes  it  possible  for  each  of  a  number  of 
different  modelers  to  calculate  their  own  best  estimate 
steady  state,  given  the  same  set  of  observation  data. 

There  are  also  two  main  types  of  simulations:  steady 
state  and  terminating  (22:733).  This  research  will  deal 
with  the  first,  where  the  system  is  assumed  to  have  run  long 
enough  to  obtain  an  asymptotic  distribution  of  the  target 
variate.  In  the  latter,  the  simulation  is  run  until  a 
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certain  event  occurs  or  for  a  pre-spec i f i ed  length  of  time. 

For  each  simulation,  the  number  of  runs  required  and 
the  run  length  are  both  determined  arbitrarily  by  the 
modeler.  There  are  heuristic  algorithms  to  aid  in  the 
determination  of  the  number  of  runs  required  for  a  certain 
accuracy.  These  are  based  on  the  results  of  a  number  of 
"pilot  runs."  These  values  are  influenced,  though,  by  the 
modeler’s  identification  and  truncation  of  the  transient. 
The  simulation  must  also  be  allowed  to  run  long  enough  to 
capture  the  extent  of  the  transient  period.  For  this 
reason,  initial  runs  are  typically  much  longer  than 
necessary,  and  thus  add  to  the  cost  of  the  study. 

Purpose 

The  purpose  of  this  thesis  was  twofold.  The  first 
objective  was  to  develop  a  computer  program  which  automated 
the  batch  means  and  independent  replications  techniques  for 
simulation  output  analysis.  The  second  was  to  determine  if 
Kalman  filtering  offered  any  efficiencies  in  determining 
steady  state  characteristics  of  a  class  of  queueing 
simulation  models. 

The  queueing  models  to  be  evaluated  will  exhibit 
Markovian  characteristics  in  their  arrival  and  service 
rates,  and  will  have  a  single  activity  with  one  server, 
referred  to  as  an  M/M/1  queue  (24:239).  The  term  Markovian 
indicates  that  "the  conditional  probability  of  the  future 
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X(t+5)  given  the  present  XCs)  and  the  past  X(u),  0£u<s, 
depends  only  on  the  present  and  is  independent  of  the  past" 
(24:234).  The  arrival  rate  represents  the  average  number  of 
entities  which  enter  the  system  over  a  specified  period.  In 
the  case  of  this  research  the  arrival  rate  is  10 
entities/hr.  The  service  rate  represents  the  average  number 
of  entities  that  a  particular  service  activity  within  the 
system  can  perform  service  on  during  the  same  time  interval. 
For  the  purpose  of  this  thesis,  there  is  a  single  service 
activity  with  a  mean  service  time  of  15  entities/hr. 

Arrival  and  service  rates  in  the  li/M/1  queue  are  identical 
to  birth  and  death  rates  described  by  Ross  (24:236-237). 

Kalman  filtering  (13:35;  16:4)  will  be  compared  to  the 

techniques  of  batch  means  and  independent  replications  in 
estimating  the  steady  state  mean  and  standard  deviation  of 
this  estimate,  as  reflected  by  957.  confidence  intervals. 

In  order  to  present  the  purpose  of  this  research 
better,  some  further  definition  is  required  for  the  terms 
batch  means,  independent  replications.  Kalman  filtering. 
time  series  models,  and  queueing  simulation  models. 

Batch  Means.  The  batch  means  procedure  uses  one  long 
simulation  run's  output  data  to  estimate  the  mean  and  of  the 
system,  and  this  estimate's  variance.  The  output  data  is 
sectioned  into  sequential  non-over  1 app i ng  "batches"  and  then 
tested  for  independence  between  the  successive  batches.  The 
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technique  is  described  in  detail  by  Welch  (27:307),  while 
the  necessary  details  and  equations  are  set  forward  in 
Chapter  1 1 . 

Independent  Replications.  This  procedure  also  computes 
estimates  of  the  mean  and  its  variance,  but  requires 
multiple  runs  to  be  accomplished.  The  mean  of  each  run  is 
computed  and,  using  equations  derived  by  Welch  (27:296),  the 
mean  and  its  variance  are  computed. 

Kalman  Filtering.  Kalman  filtering  (7:193;  13:35; 

16:4)  IS  a  technique  from  the  area  of  control  theory  which 
has  among  its  uses  the  prediction  of  space  vehicle 
trajectory  and  real  time  updating  of  inertial  navigation 
systems.  Miller  and  Leskiw's  book.  An  Introduction  to 
Kalman  Filtering  with  Applications,  provides  the  following 
expl anat ion. 


The  basic  idea  behind  Kalman  filtering  is  to  combine 
the  system  model  (the  differential  equation  governing 
the  flight  path  of  the  vehicle)  and  the  measurement 
model  (the  observations  made  on  vehicle  position)  in  an 
optimum  fashion  in  order  to  obtain  the  best  estimate  of 
the  position  of  the  vehicle  at  any  time  t  j  >t  , 

(20: V) . 


Time  Series  Models.  Time  series  models  (1:80)  predict 
an  object's  position  at  future  discrete  moments  in  time. 
They  are  based  on  a  culmination  of  previous  observations, 
and  predict  based  on  this  history.  The  number  of  previous 
observations  used  directly  in  the  computation  of  the  next 
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position  depends  upon  the  order  of  the  equation  used.  A 
second  order  model  implies  that  the  last  two  observed 
positions,  and  errors  from  their  associated  predictions, 
are  combined  to  form  the  next  step  prediction.  Chapter  II 
addresses  the  foundations  of  the  Auto  Regressive  -  Moving 
Average  (ARMA)  model,  and  its  application  to  Kalman  filter 
generat ion. 

Queueing  Simulation  Models.  Queueing  simulation  models 
(22:382)  are  those  models  which  evaluate  a  system  which 
operates  on  the  principle  that  some  entities  that  pass 
through  the  system  will  have  to  wait  in  a  line  (queue) 
before  they  can  receive  some  particular  service.  These 
models  are  used  to  assess  the  efficiency  of  the  system  in 
processing  entities  by  measuring  such  attributes  as  system 
production,  time  spent  in  the  system,  and  utilization  rates 
of  the  specific  serving  functions. 

Sub-obJec  t i ves 

In  determining  the  effectiveness  of  Kalman  filtering  in 
identifying  the  estimate  of  the  mean  and  its  associated 
standard  deviation  for  a  system  in  steady  state,  several 
sub-objectives  had  to  be  accomplished.  First,  a  computer 
simulation  model  of  the  M/M/1  queue  had  to  be  developed. 

The  M/M/1  queue  was  selected  for  its  well  known 
characteristics  (24:306-313)  and  the  mathematical 
tractability  of  its  steady  state  equations.  The  model  was 
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constructed  using  the  SLAM  II  simulation  1 anguage 
(22:62),  with  FORTRAN  inserts.  The  output  of  interest  from 
the  model  is  the  average  time  spent  in  the  system  for  each 
ent i t  y . 

In  order  to  assess  multiple  outputs  of  the  simulation 
model  quickly,  a  micro-computer  output  analysis  program  had 
to  be  developed.  This  program,  written  in  Turbo  Pascal 
(2:23),  employs  the  batch  means  and  independent  replications 
techniques.  The  program  drastically  reduces  the  computation 
time,  allowing  the  user  to  evaluate  the  effects  of  choosing 
different  transient  termination  points  quickly  and  easily. 

The  Kalman  filter  requires  that  observations  be  equally 
spaced  with  regard  to  time.  This  is  impossible  to  do  within 
a  simulation  with  Markovian  arr ival /service  rates. 

Therefore,  a  linear  interpolation  of  the  data  was 
accomplished,  using  both  system  clock  time  and  the  entities’ 
time  in  the  system. 

Using  the  equally  spaced  data,  Kalman  filter  matrices 
were  determined  using  the  State  Space  Modeling  option  in  the 
forecasting  package  FORECAST  MASTER  (9:1).  The  matrices 
generated  were  then  placed  into  the  software  package 
Multimode  Simulation  for  Optimal  Filter  Evaluation.  MSOFE 
(4:1),  and  used  to  determine  the  underlying  mean  and 
standard  deviation  of  the  system. 
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Overview  of  Chapters 


Subsequent  chapters  will  cover  the  thesis  development. 
Chapter  II  provides  the  historical  background  and  references 
for  the  techniques  to  be  employed.  It  also  provides 
derivations  for  the  most  commonly  used  and  important 
equations.  Chapter  III,  the  methodology,  outlines  the 
development  of  the  appropriate  computer  programs  and  models, 
and  their  employment  for  output  data  generation. 

Chapter  IV  provides  the  results  of  the  three  output 
analysis  techniques:  batch  means,  independent  replications, 
and  Kalman  filtering.  Quantitative  comparisons  are  based  on 
the  technique’s  ability  to  form  the  mean  and  a  957. 
confidence  interval  based  on  the  standard  deviation. 

Chapter  V  presents  conclusions,  founded  on  the  results  of 
Chapter  IV,  and  recommendations  for  follow  on  studies. 
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1 1 .  Historical  Development 


Bac  koround 

This  chapter  reviews  applicable  output  analysis 
and  Kalman  filtering  literature,  and  covers  eight  basic 
topics:  steady  state  vs.  terminating  simulations,  moving 
averages,  confidence  intervals,  data  truncation,  cubic 
spline  and  linear  interpolation,  batch  means,  independent 
replications,  and  Kalman  filtering. 

The  moving  averages  and  data  truncation  techniques  are 
used  to  manipulate  data  for  use  in  batch  means  and 
independent  replications.  Moving  averages  "level  out"  the 
spikes  in  the  output  data  by  averaging  across  observations 
(27:29).  Confidence  intervals  will  be  used  to  define  a 
range  of  plausible  values  fov  the  true  system  mean,  centered 
on  the  estimated  mean  of  the  observations.  The  hal fwidth  of 
the  confidence  interval  will  be  based  upon  the  variance  of 
the  observations. 

Data  truncation  refers  to  removing  the  transient  or 
"warm-up"  period  data  from  the  simulation  output,  leaving 
only  steady  state  data  (22:44).  This  is  done  to  provide 
stationary  output  to  be  evaluated. 

Recent  work  by  Kelton  (14:355-366)  indicates  that  there 
could  be  some  benefit  in  choosing  appropriate  initial  system 
conditions.  This  article  suggests  procedures  for 
identifying  reasonable  initialization  parameters  based  on 
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the  results  of  pilot  runs.  One  of  the  expected  benefits  of 
the  adaptive  Kalman  filtering  procedure,  however,  is  that 
once  the  filter  has  been  fit,  it  will  provide  an  accurate 
next  step  ahead  forecast  and  the  variance  of  the  forecast, 
even  with  the  transient  data  included.  To  this  end,  all 
simulation  runs  will  be  started  with  "empty  and  idle" 
conditions,  but  various  truncation  points  were  used  when 
applying  the  batch  means  and  independent  replications 
techniques  to  evaluate  the  effect. 

Independent  replications  and  batch  means  represent  two 
basic  methods  for  obtaining  estimates  of  the  steady  state 
mean  and  its  associated  variance  (22:732).  The  main 
difference  is  that  the  method  of  independent  replications 
requires  multiple  runs  of  the  same  simulation,  whereas  batch 
means  uses  one  very  long  run. 

These  techniques  are  used  to  evaluate  the  output  of 
computerized  simulation  models,  which  Welch  describes  in  the 
following  way: 


These  models  have  a  random  input  that  consists  of  a 
set  of  sequences  of  random  variables  whose 
distributions  are  specified.  .  .  Correspondingly, 

the  models  have  a  random  output  that  consists  of  a 
set  of  sequences  of  random  variables  .  .  .  whose 

distributions  are  unknown.  The  reason  for 
constructing  and  running  the  simulation  is  to 
estimate  certain  characteristics  of  these  output 
distributions.  The  model  is  simulated  because  these 
characteristics  cannot  be  computed  from  analytic 
results.  (27:268-269) 
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It  was  the  purpose  of  this  thesis  to  construct  a 
computer  software  program  which  automated  the  batch  means 
and  independent  replications  procedures  and  to  determine 
whether  Kalman  filtering  techniques  could  be  employed 
effectively  in  assessing  the  underlying  characteristics  of 
the  system  being  modelled. 


Steady  State  vs.  Terminat ino  Simul at  ions 


The  differences  in  these  two  types  of  simulation  models 
ar e  as  foil ows : 

Terminating  Simulations  -  Simulation  ends 
when  a  prespecified  event  or  condition  occurs 
or  a  certain  amount  of  time  elapses. 

Steady  State  -  Assumes  a  situation  that  can 
be  thought  of  as  being  able  to  run  long 
enough  that  an  asymptotic  distribution  of  the 
target  variate  is  (for  practical  purposes) 
real ized . 


This  thesis  dealt  with  the  case  of  steady  state 
simulations,  which  have  reached  an  asymptotic  approximation 
of  true  steady  state  behavior.  It  is  important  to  note  that 
true  steady  state  is  impossible  to  achieve  in  a  process 
defined  on  a  finite  time  interval. 


Moving  Averages 

This  technique  can  be  used  to  remove  large  fluctuations 
from  the  graphical  portrayal  of  the  output  data.  Performing 
a  moving  average  (27:293-294)  provides  a  new  "averaged" 
value  (p„)  for  each  of  the  N  original  data  points. 


These 


values  are  then  plotted,  and  reviewed  to  determine  trends  in 
the  true  system  mean,  p.  This  aids  the  experimenter  by 
making  it  easier  to  interpret  the  data  to  determine  the 
extent  of  the  transient  period  (27:293),  for  M  runs  of  the 
simulation.  Welch  addresses  the  technique  in  the  following 
way : 


.  .  .  sometimes  these  judgments  about  the  long  term 

trends  in  <p„>  are  easier  to  make  if  an  explicit 
attempt  is  made  to  smooth  out  the  short  term  (high 
frequency)  fluctuations  in  (p„>.  The  simplest  way 
to  do  this  is  to  take  a  moving  average  over  an 
interval  long  enough  to  remove  short  term 
fluctuations  but  not  so  long  as  to  distort  the  long 
term  trend.  (27:293-294) 

A  moving  average  of  length  2K  +  1  is  defined  as 


IK 

(2K  +  1)“  I  p  if  n  >=  K  +  1 

k=-K 

n-1 

(2n  -  !)•*  E  p  if  n  <  K  +  1  (1) 

k=-(n-l ) 


The  next  two  figures  present  a  graphical  comparison  of 
the  two  techniques.  Figure  1  depicts  a  representative 
computer  simulation  output  with  customer  time  in  system 
represented  on  the  Y-axis  and  the  number  of  the  entity  on 
the  X-axis.  Figure  2  shows  the  data  from  Figure  1,  but  with 
moving  average  values  from  a  window  with  a  K  value  of  5. 
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Confidence  Intervals 


Confidence  intervals  are  used  when  it  isn’t  possible  to 
arrive  at  an  exact  answer.  The  intervals  are  centered  at 
the  best  estimate  attainable,  and  then  a  "halfwidth"  value 
is  subtracted  and  added  to  form  the  interval.  The  halfwidth 
is  determined  by  three  factors:  the  number  of  observations 
to  be  used,  the  significance  level  (usually  denoted  by  a), 
and  the  standard  deviation  of  the  estimate  (6:266-268).  See 
Equation  (6)  below  for  the  actual  computation  (6:268). 

Equation  (6)  provides  a  halfwidth  which  will 
statistically  cover  the  mean  (l-a)*/.  of  the  time.  This  is 
because  the  t  value  is  derived  from  the  fact  that  the 
population  from  which  the  mean  was  estimated  is  assumed  to 
have  a  normal  distribution.  In  most  cases  the  true  variance 
of  the  system  is  unknown  and  so  the  standard  deviation  (ir) 

IS  approximated  by  the  standard  deviation  of  the  estimates 
(s)  . 

In  general,  when  estimating  confidence  intervals,  the 
equations  below  are  used.  In  keeping  with  Welch’s  notation 
the  number  of  runs  is  represented  by  "M",  the  number  of 
observations  per  run  by  "N,",  and  the  n*"  observation  of  the 
m*"  run  by  (27:290). 
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Then  the  mean  of  the  m*"  run  can  be  computed  as; 


_  N 

p.  =  V  ,  =  1/M  Z  V  ,  m  =  (3) 

n=  1  mn 

Using  the  mean  values  for  each  run,  the  variance  and 
standard  deviation  can  be  determined  using  the  following 
equations  (27:294). 

N 

Var.  =  s.*  =  1/(M-1)  £  CV.„  -  p.)*  (4) 

n=l 

Std  Dev„  =  s„  =  (var„)‘'*  (5) 

These  values  are  then  combined  with  the  t  critical 
value  to  determine  the  boundaries  of  the  confidence  interval 
using  the  equation  below,  where  (1  -  a)  represents  the 
confidence  coefficient  and  (n  -  1)  indicates  the  degrees  of 
freedom  for  the  t  value  (6;268). 

Hal  fwidth  =  x  ±  *  s/Tn  (6) 

Data  Truncation 

This  technique  eliminates  the  transient  period  data 
within  a  run.  In  this  application,  the  cutoff  value  is 
determined  by  the  experimenter,  based  on  a  graphical 


5 


display,  as  shown  above  in  Figure  1. 

Once  the  cutoff  value,  n,  ,  is  established,  all 
observations  previous  to  this  point  (from  1  to  n,  )  are 
deleted  across  the  M  runs.  This  leaves  the  data  which 
represents  the  system’s  performance  in  steady  state 
(27:289) . 

As  mentioned  previously,  Pritsker  highlights  some 
limitations  with  this  technique.  "The  truncation  point  is 
selected  as  the  time  at  which  the  response  ’appears’  to 
have  reached  steady  state"  (22:44). 

This  arbitrary  selection  can  lead  to  various  problems 
as  noted  by  Kleijnen; 


Practitioners  often  throw  away  the  Cinitial3  part  of 
the  time  series;  that  is,  they  "warm-up"  the 
simulation  before  they  start  recording  observations. 
Unfortunately,  two  practical  problems  remain. 

1.  How  can  we  determine  whether  the  transient  phase 
is  over'5 

2.  Throwing  away  the  initial  phase  of  each  run 
wastes  computer  time. 

Practitioners  often  construct  graphs  (and  making 
graphs  is  always  an  excellent  idea  in  any  statistical 
experiment)  to  see  whether  start-up  effects  have 
"obviously"  disappeared.  .  .  If  we  overestimate  the 

length  of  the  transient  phase,  we  throw  away 
information  on  the  steady  state  and  increase  the 
variance  of  the  final  estimator.  If  we  do  not  wish 
to  waste  computer  time,  we  may  be  tempted  to 
underestimate  the  initial  phase  and  we  bias  the  final 
estimator.  (15:65) 
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Cubic  Spline  and  Linear  Interpolation 


The  simulation  output  data  provided  by  the  SLAM  II 
model  is  unequally  spaced  in  time.  This  is  caused  by  the 
exponential  nature  of  the  arrival  and  service  rates.  In 
order  to  employ  the  adaptive  Kalman  filtering  program 
FORECAST  MASTER,  however,  the  input  data  is  assumed  to  be  at 
regular  time  intervals. 

Two  methods  were  identified  to  transform  the  output 
data  into  these  regular  intervals.  The  first,  cubic  spline 
interpolation,  fits  a  continuous  function  through  the  data 
points.  This  function  passes  through  each  of  the  data 
points  and  is  continuously  differentiable-  Burden  and 
Faires  (3:118)  provide  the  following  explanation: 

A  general  cubic  polynomial  involves  <'our 
constants;  so  there  is  sufficient  flexibility  in 
the  cubic  spline  procedure  to  ensure  not  only  that 
the  interpolant  is  continuously  differentiable  on 
the  interval,  but  also  that  it  has  a  continuous 
second  derivative  on  the  interval.  The 
construction  of  the  cubic  spline  does  not, 
however,  assume  that  the  derivatives  of  the 
interpolant  agree  with  those  of  the  function,  even 
at  the  nodes. 


Burden  and  Faires  also  provide  a  derivation  of  the 
equations  used  to  form  the  cubic  spline  (3:118-120),  along 
with  several  computer  algorithms  for  different  applications. 
The  program  MATRIX,  (12:4-34)  was  used  to  obtain  the  cubic 
spline  for  the  unequally  spaced  output  data. 

The  second  method  used  was  that  of  piecewise  linear 
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interpolation.  This  procedure  connects  adjacent  data  values 
by  straight  lines,  and  then  computes  equally  spaced 
observations.  In  this  experiment  a  regular  time  interval  of 
10  minutes  was  used.  Appendix  D  contains  a  copy  of  the 
program  used  to  obtain  the  linear  data  and  some  sample 
output.  Data  points  were  interpolated  for  the  first  1000 
data  points  in  each  of  the  50  output  data  files. 

Independent  Replications 

Also  called  "replications,"  this  technique  deals  with 
multiple  runs  of  the  same  simulation,  assuming  independence 
between  the  runs.  The  method  of  replications  calculates  an 
estimate  of  the  mean  and  its  corresponding  variance 
(22:732) . 

After  the  transient  data  has  been  removed  from  each  of 
the  M  runs,  the  remaining  values  all  have  approximately  the 
same  mean,  which  can  be  computed  with  the  following 
equations  (27:29&). 

In  the  method  of  independent  replications, 
independent  sample  means  are  generated  from 
independent  simulation  runs.  A  point  estimate  and  a 
confidence  interval  are  generated  from  these  sample 
means  as  we  shall  now  describe  .  .  . 

On  each  replication  we  form  the  sample  mean 

N 

p,  =  l/(N-ng  )  E  V,„  (7) 

n=n,+ 1 

.  .  .  approximately  unbiased  estimator  of  p  is 
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(8) 


M 

)j  =  1/M  Z  ja. 

m=  1 


Moreover,  if  we  let 


M 

1/(M-1)  E  jji.* 
m=  1 

-  M/(M-1)  p*  (9) 

.  .  .  we  obtain  the  confidence  interval  and 

statement 

ProbCjj  -  t„.,  (  l-a/2)  (s(ja.)/M‘'*  <  p  < 

p  +  t„.t  (  l-o(/2)s(p.) /M‘'*>  =  1-a  (10) 

According  to  Pritsker  the  technique  has  the  following 
advantages  and  disadvantages. 

The  replication  procedure  has  the  desirable  property 
that  samples  are  independent.  Another  advantage  is 
that  it  can  be  used  for  both  terminating  and  steady 
state  analysis  where  a  terminating  analysis  is  one 
that  is  performed  for  a  specific  finite  time  period. 
The  disadvantages  associated  with  replications  are: 

1)  each  replication  contains  a  startup  segment  which 
may  not  be  representative  of  stationary  behavior;  and 

2)  only  one  sample,  X,  ,  is  obtained  from  each 
replication  which  could  mean  that  extensive 
information  about  the  variable  of  interest  is  not 
being  gleaned  from  the  data.  (22:733) 

Batch  Means 

This  technique  deals  with  the  case  of  using  only  one 
long  run  of  output  data.  According  to  Welch  it  "exactly 
parallels  the  method  of  independent  replications  except  that 


M 

s*  (p.  )  =  1/(M-1)  E  (p.  -  p)*  = 

m=  1 
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the  sequences  are  adjacent,  non-overlapping  subsequences  of 
the  output  of  a  single  simulation  run"  (27:307). 

The  most  important  part  is  to  determine  the  batch  size, 
m,  necessary  to  indicate  independence  among  the  batches 
(8:511).  Once  this  independence  is  identified,  the  batches 
can  be  compared  as  if  they  were  separate  runs  (5:303). 

Chen  and  Seila  state  the  ",  .  .  batch  means  method 

depends  .  .  .  upon  establishing  a  batch  size  that  is  large 

enough  that  all  batches  are  approximately  uncorrelated" 
(5:302).  A  method  which  tests  this  autocorrelation  between 
the  batches  was  developed  by  Tishman,  and  uses  a  test 
statistic  C,  (8:514).  When  the  autocorrelation  effect  is 
larger  than  the  statistic,  it  indicates  that 

independence  between  the  batches  is  reasonable. 

Fishman's  work  uses  the  symbol  Y  to  represent 
individual  observations.  To  keep  with  the  notation  used 
earlier  in  the  text,  the  following  equations  will  have  the 
constant  V  in  place  of  Y.  Within  the  equation,  k  is  the 
total  number  of  batches,  and  m  is  the  number  of  observations 
per  batch  (8:514): 

k-1  k 

C»  =  1  -  E  (V,..  -  V,„,,  )*/2  E  (V.,.  -  p„)*  (11) 

1=1  ’  ’  i = 1 

The  algorithm  starts  by  setting  the  number  of  batches 
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equal  to  the  number  of  observations.  It  then  determines 
values  for  the  test  statistic.  If  the  values  indicate 
independence,  then  the  testing  stops  and  a  mean  and 
confidence  interval  are  computed.  If  the  values  do  not  pass 
the  test,  then  the  number  of  batches  is  divided  by  two 
(k/2),  resulting  in  a  doubled  batch  size.  The 
autocorrelation  and  test  statistic  are  then  compared,  and 
the  process  continues  until  the  values  pass  the  independence 
test  or  the  number  of  batches  becomes  less  than  eight 
(8:515).  In  explaining  why  at  least  eight  batches  are 
required,  Fishman  references  Von  Neumann’s  work,  stating  a 
comparison  at  the  .05  significance  level  for  C»  with  k  >=  8 
"indicates  negligible  error"  (8:514). 

Chen  and  Seila  identify  the  following  equations  for 
computing  the  mean  for  the  data  from  the  independent 
batches.  The  constant  "i"  represents  the  number  of  the 
batch,  while  k  is  the  number  of  batches  (5:302). 

The  i’"  batch  mean,  then  is 
_  k 

X.  =  1/k  Z  X. (12) 

J  =  1 

.  .  .  the  point  estimator  for  p  is  the  sample 

mean  of  the  batch  means 

m  _ 

M  =  1/m  E  X,  (13) 

1  =  1 


21 


The  standard  deviation  of  the  estimator  and  its 


confidence  interval  can  be  calculated  with  Eqs.  (9)  and  (10) 
above  by  substituting  m  for  M. 

Pritsker  refers  to  this  technique  as  "sub  interval s"  and 
highlights  the  following  advantages  and  disadvantages. 


The  advantages  of  using  subintervals  to  estimate  the 
variance  of  the  sample  mean  are  that  a  single  run  can 
be  used  to  obtain  an  estimate  and  only  one  transient 
period  is  included  in  the  output  (or  required  to  be 
deleted).  The  disadvantage  of  the  procedure  is  in 
establishing  the  batch  size,  b,  which  makes  the 
subintervals  independent.  (22:735) 


Kalman  filtering 

Kalman  filtering  is  described  by  Maybeck  (16:4)  as  an 
"optimal  recursive  data  processing  algorithm."  In  his 
text,  Stochastic  Models  Estimation,  and  Control,  he 
describes  the  Kalman  filter’s  underlying  assumptions  (16:7). 


A  Kalman  filter  performs  this  conditional 
probability  density  propagation  for  problems 
in  which  the  system  can  be  described  through 
a  1  inear  model  in  which  system  and 
measurement  noises  are  wh i t e  and  Gauss i an  (to 
be  explained  shortly).  Under  these 
conditions,  the  mean,  mode,  median,  and 
virtually  any  reasonable  choice  for  an 
"optimal"  estimate  all  coincide,  so  there  is 
in  fact  a  unique  "best"  estimate  of  the  value 

of  X  . 


According  to  Mavbeck. "whiteness  implies  that  the  noise  value 
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is  not  correlated  in  time,"  and  being  Gaussian  implies  that, 
"at  any  single  point  in  time,  the  probability  density  of  a 
Gaussian  noise  amplitude  takes  on  the  shape  of  a  normal  bell 
shaped  curve."  One  of  the  greatest  strengths  of  the  Kalman 
filter  is  its  versatility  once  these  three  assumptions  are 
made.  It  can  be  implemented  over  a  wide  spectrum  of 
applications  from  many  fields. 

Recent  work  by  the  U.S.  Army  at  Fort  Monmouth,  New 
Jersey  (26:9-32),  applied  Kalman  filtering  to  estimating 
"clocks."  This  study  indicated  the  benefit  of  Kalman 
filtering,  comparing  it  to  standard  auto  regressive 
integrated  moving  average  (ARIMA)  models. 

Static  Case.  In  an  article  entitled  Understanding  the 
Kal man  Filter.  Meinhold  and  Singpurwalla  highlight  the  fact 
that  most  of  the  documentation  in  this  area  has  been  done  by 
the  engineering  and  scientific  community.  Therefore,  the 
notation  and  methodology  are  typically  difficult  to  relate 
with  common  statistical  formulae  (19:123).  They  have  taken 
the  basic  concepts  of  the  Kalman  filter  and,  using  Bayesian 
formulation  and  multivariate  statistics,  have  provided  a 
statistical  derivation  of  this  important  process  in 
statistical  terms. 

Meinhold  and  Singpurwalla  provide  a  clear  path  to 
follow.  They  develop  the  concept  of  the  Kalman  filter  as 
the  combination  of  system  and  observation  equations. 
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The  following  quote  identifies  the  components  of  the 
system/observat ion  equations  and  their  significance: 


Let  Y,  ,  Yj,j  ,  .  .  .  ,  Y,  ,  the  data  (which  may  be 
either  scalar  or  vectors),  denote  the  observed 
values  of  a  variable  of  interest  at  times  t, 
t-1,...,  1.  Ule  assume  that  Y,  depends  on  an 

unobservable  quantity  9, ,  known  as  the  state  of 
nat ure .  Our  aim  is  to  make  inferences  about  9,  , 
which  may  be  either  a  scalar  or  a  vector  and  whose 
dimension  is  independent  of  the  dimension  of  Y,  . 
The  relationship  between  Y,  and  9,  is  linear  and 
is  specified  by  the  observation  equation 


Y,  =  F,  9.  +  V, 


(14) 


where  F,  is  a  known  quantity.  The  obser vat i on 
error  v.  is  assumed  to  be  normally  distributed 
with  mean  zero  and  a  known  variance  V,  ,  denoted 
as  V,  5:  N  (0,  V,  )  . 

The  essential  difference  between  the  [Kalman 
filter]  and  the  conventional  linear  model 
representation  is  that  in  the  former,  the  state  of 
nature  -  analogous  to  the  regression  coefficients 
of  the  latter  -  is  not  assumed  to  be  a  constant 
but  may  change  with  time.  This  dynamic  feature  is 
incorporated  via  the  system  equation,  wherein 

9,  =  G,9,.,  +  w,  (15) 


G,  being  a  known  quantity,  and  the  system  equation 
error  w,  ^  N  (0,  W,  ),  with  W,  known... 

In  addition  to  the  usual  linear  model 
assumptions  regarding  the  error  terms,  we  also 
postulate  that  v,  is  independent  of  w,  . 


To  provide  an  example,  Meinhold  and  Singpurwalla 
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present  the  tracking  of  a  satellite’s  orbit  (19:124).  This 
provides  some  insight  as  to  the  physical  relationships 
within  the  equations. 


The  unknown  state  of  nature  9,  could  be  the 
position  and  speed  of  the  satellite  at  time  t, 
with  respect  to  a  spherical  coordinate  system  with 
origin  at  the  center  of  the  earth.  These 
quantities  can  not  be  measured  directly.  Instead, 
from  tracking  stations  around  the  earth,  we  may 
obtain  measurements  of  distance  to  the  satellite 
and  the  accompanying  angles  of  measurement;  these 
are  the  Y,’s.  The  principles  of  geometry,  mapping 
Y,  into  0,  ,  would  be  incorporated  in  F,  ,  while 
V,  would  reflect  the  measurement  error;  G,  would 
prescribe  how  the  position  and  speed  change  in 
time  according  to  physical  laws  governing  orbiting 
bodies,  while  w,  would  allow  for  deviations  from 
these  laws  owing  to  such  factors  as  nonuniformity 
of  the  earths  gravitational  field,  and  so  on. 


The  key  to  the  Kalman  filtering  algorithm  lies  in  its 
"recursive"  nature.  Future  predictions  are  based  only  on 
the  last  state  and  its  corresponding,  cumulative  covariance 
kernel,  a  matrix  containing  the  covariance  estimates  for  the 
states  of  the  process.  The  process  is  completely  defined  by 
its  first  two  moments,  due  to  the  underlying  assumption  that 
the  process  can  be  represented  by  a  linear  system  driven  by 
white,  Gaussian  (normal)  noise. 

The  main  intent  of  Meinhold  and  S i ngpur wal 1  a ’ s  article 
is  to  derive  inferences  about  0,  ,  given  the  input  values  Y, 
using  Bayesian  statistics.  The  following  will  overview  the 
derivation  of  the  Kalman  filtering  equations,  while 
expanding  on  some  of  the  actual  calculations. 
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The  derivation  through  the  direct  application  of  Bayes 


theorem: 


ProbfState  of  Nature|Data>  a  ProbfData | State  of  Nature} 

X  ProbfState  of  Nature} 


or 


P(e,|Y,)  a  P(Y,|e,,Y..,)  X  P(e,|Y..i) 


wher  e 


Y,-.  =  <Y..,,  Y,., 


Y.) 


(16) 


As  noted  by  Meinhold  and  Singpur wal 1  a,  the  left  side  of 
the  equation  represents  the  posterior  distribution  for  9  at 
time  t,  while  the  terms  on  the  right  hand  side  represent  the 
1 1 kel i hood  and  prior  distributions  respectively. 

At  this  point,  knowledge  of  the  value  of  9,.,  can  be 
stated  in  a  probability  statement,  where  9  is  the  best 
estimate  of  the  system  state  at  the  last  time,  and  E  is  the 
associated  covariance  kernel  of  this  estimate: 

^®.-i|Y,.,)  5:  N<e,.,,E,.,)  (17) 

The  recursive  estimation  begins  at  time  O,  with  initial 
estimations  9,  and  E,  .  The  process  then  looks  forward  to 
system  performance  a  time  t  in  the  future.  This  is  done  in 
two  increments:  the  first  is  a  propagation  of  the  initial 
estimates,  the  second  an  update  based  on  the  actual 
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obser  vat i on . 


In  the  propagation  phase  of  the  est  i  inat  i  on ,  the  best 
choice  for  the  value  of  ©,  is  given  by  the  system  equation 
'»'*■  Based  on  the  well  known  statistical 
rel at lonsh ip : 

X  N(p,I)  =>  CX  N(Cm,CEC')  (18) 

the  mean  and  variance  of  0,  can  be  seen  to  be 

(e,jY,.,)  N<G,e,.,,R  ,=G  +W,)  (19) 

which  represents  the  prior  distribution. 

Upon  making  the  observation  at  time  t,  the  posterior 
probabilities  must  now  be  computed.  This  requires  the 
estimation  of  the  likelihood  L(©,|Y,),  or  equivalently 
P(Y,  I©,  ,Y,.,).  This  estimation  follows,  where  e,  represents 
the  error  in  predicting  Y,  from  the  point  t-1,  also  known 
as  the  residual: 


=  Y,  -  Y,  =  Y,  -  F,  G,©,., 


(20) 


As  F,  ,  G,  ,  and  ©,.,  are  all  known,  knowing  Y,  is 
equivalent  to  knowing  e,  .  With  this  information,  the 
probability  statement  is  now  rewritten; 


P(©,|Y,,Y..,) 


P(e,|e.,Y,.,)  o  P(e,j9»,Y,.^) 


X  PO.jY,.,) 


(21 ) 
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with  P(e,|0,  being  the  likelihood. 

Now,  incorporating  the  observation  to  replace  Y,  in 
the  equation  for  e,  above,  the  result  is  show  to  be 


e,  =  r.ce,  -  G,e,.,)  +  V,  (22) 

making  the  value  of  E(e,|et  ,Y,.,)  =  F,(e,  -  G,6,.,). 

As  the  distribution  of  v,  has  been  assumed  to  be 
N(0,V,  ),  it  can  be  seen  that  the  likelihood  can  be 
described  as  follows: 

(e,|e,  ,Y,.,)  N<F,<e.  -  G,e..,),V,)  (23) 

Oayes’  theorem  now  results  in 


P(e,  0,,Y,.,)  X  PC0,[Y,.,)  (24) 

P<e.iY.,Y,.,)  =  - - - : - 

3  P(e,,0,|Y...)d0, 

This  is  the  best  estimator  of  0,  at  time  t.  Once  this 
value  is  computed,  the  process  is  repeated  for  a  point,  time 
t  in  the  future. 

Meinhold  and  Singpurwalla  note  (19:125)  that  the  effort 

to  determine  P(0,|Y,)  above  could  be  reduced  by  using  well 

known  results  from  multivariate  statistics. 

Let  X,  and  X,  have  a  bivariate  normal  distribution 
with  means  p,  and  p,,  respectively,  and  a 
covariance  matrix 


We  denote  this  by 


^ti 


(25) 
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(26) 


When  C26]  holds,  the  conditional  distribution  of 
Xj  given  Xj  is  described  by 

(X,|X2  -rj)  ~  +  ^12^21  ^  *2~M2^  »  ^  ll~^  12^  22  ^21^ 

(27) 

The  quantity  p,  ■*■  ^  12  ^  22  ^’>^2  -  p,)  is  called  the 

regression  function,  and  E,2  E22’’  referred  to  as 
the  coefficient  of  the  least  squares  regression  of 

Xj  •  •  • 

For  our  situation,  we  suppress  the  conditioning 
variables  Y,.,  and  let  X,  correspond  to  e,,  and  X2 
correspond  to  0,  ;  we  denote  this  correspondence 
by  X,  <  =  >  e,  and  X2  <  =  >  ©,.  Since 

(©,jY,.,)  N(G,e,.,  ,R,)  (see(19)) 

we  note  that 


P2  G*0,., 


and 


E22  ^  —  ^  R* 


If  in  (26)  we  replace  X,  ,  X,  ,  p,  ,  and  Ex2  *^y 
9,  ,  G,  9,.,,  and  R,  ,  respectively  and  recall  the 
result  that 

(e,|0,,Y,.,)  N(F,(©,-G,©,.,)  ,  V.)  .  (20) 

With  C28D  and  the  fact  that 

0,  5:  N(G.©,.,,R,) 

the  expected  value  for  (27)  can  be  computed  as 


E(X,jX2— Xx)  —  p,  +  Ejx  I^xt  ^*2  M2^ 


where,  substituting  in  for  X,  ,  Xx  ,  M2  »  ^22 


(29) 
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E(e,|e.  ,Y,.,  )  =  p,  +  r.j  R,-‘  ce,  -G.  e,.,).  (30) 

Comparing  this  to  (28)  results  in 

Pi  +  ^12  -  G.©.-i>  =  -  G,e,.,),  (31) 

which  impl  ies  that  p,  =  0,  and  E,,  =  F,  R,. 

With  this  information,  the  variance  of  this  function  is 
computed  as  follows,  where  R,  is  symmetric: 

Var(e,|0,,Y,.,)  =  E,,  “  ^12^ 22  ^21  (32) 

=  E.,  -  F.  R,  R.-'  R,  F,' 

=  E„  -  F,  R,  F/ 

From  (28),  this  variance  equals  V,,  therefore 

V,  =  E„  -  F,  R.  F/  (33) 

or,  equivalently 

1:11  =  V,  +  F,  R,  F/  (34) 


Substituting  these  results  into  (26)  gives 


Now  substitute  into 

(0,|e,  »Y,.,)  ^  N(p,  +E12  ^22  ^^2~  P2^> 

j:.,  -  2.2  E22”  J:2,)  (35) 

and  derive  the  following  expressions  for  mean,  ©,,  and 
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variance,  E,. 


e,  =  Ece.je,)  =  G,  e,.,  +  R,  F,»  (V,  +  F,  R,  F,’)-‘  e,  (36) 

E,  =  Var(e,|e,)  =  R,  -  R,  F,'  (V,  +  F,  R,  F/  ) '*  F,  R,  (37) 

where  R,  F,’  (V,  +  F,  R,  F,’)"*  represents  the  Kalman  filter 
ga i n ,  K . 

Now  that  the  values  for  6,  and  E,  have  been  derived, 
all  that  is  required  is  to  specify  a  system’s  G  and  F 
matrices  and  the  initial  values  for  and  E^.  The  program 
FORECAST  MASTER  has  the  same  type  of  Kalman  filtering 
approach,  but  uses  the  "innovations  form"  of  the  Kalman 
filter  (9:7-6),  as  shown  on  page  34. 

Adaptive  Filtering.  Adaptive  Kalman  filtering  falls 
typically  into  one  of  two  styles.  The  first  incorporates  a 
bank  of  Kalman  filters  which  are  each  built  for  a  particular 
type  of  system  performance,  such  as  steady  state,  a 
transient  value,  or  a  step  change.  Harrison  and  Stevens 
(10:343-344)  apply  this  strategy  in  short  term  forecasting, 
using  a  Bayesian  approach. 

The  filters'  outputs  are  then  fed  into  a  controller 
which  evaluates  the  residual  values  from  each  filter.  The 
controller  then  selects  the  best  filter  based  on  the  lowest 
residual  magnitude,  relative  to  the  filter-computed  residual 
covariance  CV,  +  F,  R,  F,’D.  This  process  is  repeated  at  each 
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covariance  CV,  +  F,  R,  F,’  D.  This  process  is  repeated  at  each 
input  value,  allowing  the  system  to  react  instantly  to  state 
changes  in  the  data. 

The  second  style,  and  the  one  used  in  this  thesis, 
depends  upon  a  single  Kalman  gain  matrix,  generated  by 
maximum  likelihood  estimations.  The  process  to  be  used 
stems  from  the  work  of  Mehra  (18:175-184),  and  uses  maximum 
likelihood  methods  to  reduce  the  residual  values  to  the 
level  of  white  noise.  This  was  done  by  fitting  the  ARMA 
model  which  had  the  residual  values  most  in  consonance  with 
the  filter-computed  covariance  CV,  +  F,  R,  F,’3,  and  then 
representing  this  model  in  state  space  form. 

ARMA  models  combine  the  aspects  of  two  different  types 
of  times  series  models,  autoregressive  and  moving  average, 
to  obtain  the  most  parsimonious  representation  of  the 
system.  The  autoregressive  (AR)  model  is  based  on  the 
findings  of  Yule  and  Mold,  as  referenced  by  Abraham  and 
Ledolter.  The  process  is  based  on  the  observation  that 
"every  weakly  stationary  nondetermin i st i c  stochastic  process 
(z,  -  p)  can  be  written  as  a  linear  combination  (or  linear 
filter)  of  a  sequence  of  uncorrelated  random  variables." 
(1:197)  The  order  of  the  model  determines  the  number  of 
terms  required.  For  the  AR  model,  the  order  is  typically 
denoted  by  the  letter  "p."  Equation  (38)  shows  the  general 
equation  for  an  AR(p)  process: 
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5i  z.-i 


+  a, 


(38) 


z. 


5, 


z 


*-p 


where  the  value  a,  represents  the  random  shock  at  time  t, 
from  "a  sequence  of  uncorrelated  random  variables  from  a 
fixed  distribution  with  mean  E(a,)  =  0,  and  variance  V(a*)  = 
a*."  (1:197)  The  values  of  the  ij  expressions  represent  the 
coefficients  associated  with  the  previous  p  observations. 

The  moving  average  (MA)  model  is  based  on  using  the 
previous  "random  shock"  values,  a,,  in  conjunction  with 
their  appropriate  weighting  coefficients.  These 
coefficients,  are  usually  denoted  as  0,,  but,  to  avoid 
confusion  with  those  of  the  steady  state  Kalman  filter,  the 
variable  t,  will  be  used  instead.  Equation  (39)  represents 
the  general  MA  (q)  model,  where  "q"  denotes  the  model's 
order  (1:217): 


z,  p  a,  Xj  ...  (39) 

Although  either  of  these  techniques  can  be  used  to 
model  a  system,  the  most  parsimonious  representation  is 
usually  a  combination  of  the  two,  in  the  form  of  the  ARMA 
model,  where  a  "backshift  operator  is  employed.  This 
operator,  "B, "  denotes  stepping  backward  through  time.  In 
general  Bz,  =  z,.,,  and  B*  z,  =  z,.,.  The  general  ARMA  (p,q) 
model  is  shown  below  in  equation  40  (1:222). 
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(1  -  S,  B  -  ...  -  5,  B‘’)(z*  -  JJ)  = 

(1  -  t,B  -  ...  -  T,  B")  a,  (40) 


Once  the  optimal  ARMA  model  has  been  derived,  it  is 
placed  into  a  state  space  form.  For  details  on  this 
procedure  see  Harvey  (11:101-104),  Shea  (25:92),  or  Abraham 
and  Ledolter  (1:359-360).  Abraham  and  Ledolter  use  S,  to 
represent  the  system  states,  for  which  6,  will  be 
substituted  to  keep  with  previous  notation.  Looking  at 
their  general  case  for  an  ARMA  (p,q)  model  (1:360),  where 
CG]  is  an  identity  matrix  and  there  is  no  measurement  noise: 


y,  =  5,  +  ...  +  +  a,  -  T,a*., 


T,a..,  (4  1) 


the  state  model  is  given  by 

y,  =  Cl  0  0  0  ...  030, 


m 

•  — 

•  • 

1 

I.-, 

©.,2 

-T| 

• 

— 

• 

+ 

• 

i,  0  . .  .  0 

a 


**i 


(42) 


The  program  FORECAST  MASTER  then  used  these  values  to 
form  the  appropriate  Kalman  filter  matrices  for  computation, 
using  the  Goodr i c h-Lar i mor e  algorithm  (9:7-8),  a  variation 
of  the  Akaike-Mehra  algorithm,  to  identify  the  best  values 
of  state  transition  G,,  Kalman  gain  K,  and  measurement 
(forecast)  matrix  F,  of  Eqs.  (36)  and  (37).  The  format  used 
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by  FORECAST  MASTER,  however,  was  the  al gebr a i c al 1 y 
equivalent  "innovations  form"  of  the  Kalman  filter  (9:7-6). 
This  form  is  similar  to  the  equations  derived  above,  but 
results  in  the  following  equations: 

0,  =  G,  ©*-,  CGjKDv, 

Y,  =  F,e,  S,v, 


where 


S,v,  =  Y,  -  F,e,  =  r, 

1/2 

s,  =  cv,  +  ] 

V,  =  S/‘  CY,  -  F,©,3 
CG,K],  =  G,  R,  F/  (S/  )  * 

resulting  in  the  equations: 

0,  =  G,0,.,  +  CG,KD,S,  r, 

and 

Y»  =  F,©,  +  r, 

Summary 

There  are  several  techniques  currently  available  to 
evaluate  the  mean  and  variance  of  simulation  output.  The 
benefits  of  these  procedures  include  their  mathematical 
tractability  and  their  easy  application  of  classical 
statistics.  Drawbacks  include  the  confusion  on  the  length 
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of  the  transient,  and  coverage  rate  of  the  system  mean  given 
bias. 
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III.  Methodol oqv 


Overview 

This  chapter  will  cover  the  development  of  the  methods 
used  for  this  thesis.  In  generating  the  computer  software 
package  which  automated  the  output  analysis  techniques  of 
batch  means  and  independent  replications,  the  intent  was  to 
provide  the  analyst  with  a  quick,  easy-to-use  analytic  tool. 
This  program  was  then  employed  to  evaluate  the  possible 
benefits  of  Kalman  filtering  for  performing  the  same  tasks. 
The  measure  of  effectiveness  was  each  technique's  ability  to 
calculate  the  mean  and  confidence  interval  halfwidth,  based 
on  the  standard  deviation.  The  impact  of  choosing  the 
cutoff  value  for  the  transient  period  was  also  addressed. 

There  are  four  main  topic  areas  in  this  chapter:  model 
formulation,  output  analysis  programming,  the  application  of 
Kalman  filtering,  and  data  formatting. 

Model  formulation 

In  order  to  compare  the  different  techniques,  an  M/M/1 
simulation  queueing  model  was  developed  using  SLAM  II. 

Figure  3  shows  a  flowchart  of  the  M/M/1  queue,  while  the 
SLAM  II  and  FORTRAN  insert  programs  are  contained  in 
Appendix  A. 

Looking  at  the  individual  components  in  Figure  3,  there 
are  four  distinct  system  "nodes"  and  one  service  activity. 
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The  first  node  encountered  is  a  create  node,  which  generates 
each  of  the  entities  which  will  pass  through  the  system,  at 
a  specified  "arrival  rate."  In  this  research  the  arrival 
rate  has  been  selected  to  be  exponential,  averaging  10 
entities  per  hour.  This  results  in  an  entity  being  created, 
on  the  average,  every  6  minutes.  The  entity  then  proceeds 
to  an  infinite  capacity  queue  node,  where  it  waits  for  the 
service  activity  to  be  free.  The  waiting  criteria  for  this 
node  is  First-In-First -Out  (FIFO).  As  the  service  activity, 
located  between  the  queue  and  event  nodes,  becomes  free,  the 
next  entity  in  the  queue,  if  there  is  one,  enters  the 
activity  for  service.  Service  activities  can  have  single  or 
multiple  "servers"  which  perform  the  service.  In  this 
instance  there  is  only  one  server,  and  so  all  entities  must 
pass  through  the  activity  one  at  a  time.  The  "service  rate" 
for  the  server  represents  the  average  production  over  time. 
In  this  case,  the  service  rate  is  exponential  with  a  mean 
value  of  15  entities/hr. 

Upon  completion  of  the  activity,  the  entity  enters  the 
event  node.  At  this  point,  the  FORTRAN  subroutine  is 
accessed  to  build  the  file  containing  the  actual  system  time 
(TNOW)  and  the  time  the  entity  has  spent  in  the  system  (TNOW 
-  Time  of  Creation).  After  this  is  accomplished  the  entity 
is  removed  from  the  system  by  the  terminate  node. 
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M/M/1  QUEUE 

RHO  «  .667 


activity 


terminate 


Figure  3.  Flowchart  of  M/M/1  Queueing  Simulation 

The  model  was  run  50  times,  with  four  replications  per 
run,  and  1000  entities  per  replication.  The  data  structure 
IS  shown  in  Figure  4,  where  the  first  column  of  4000  values 
represents  system  clock  time.  This  column  was  only  required 
for  the  Kalman  filtering  technique,  and  is  discussed  below. 
The  second  column  provides  the  observations  for  the  actual 
time  in  system  for  each  entity,  and  was  used  by  all  of  the 
techniques. 

Output  Analysis  Programming 

The  SLAM  II  model  output  consisted  of  400,000  data 
values.  As  can  be  imagined,  performing  the  batch  means  and 
independent  replications  techniques  manually  on  this 
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Figure  4.  Con f igurat ion  of  SLAM  II  Output 


quantity  of  data  is  untenable.  The  Output  Analysis  Program 
was,  therefore,  developed  to  process  the  data  quickly  on  a 
micro-computer.  The  program  was  written  in  Turbo  Pascal 
(2:1),  and  is  shown  along  with  its  corresponding  user's 
manual  in  Appendices  B  and  C. 

The  program  quickly  averages  the  data,  presenting  it  to 
the  analyst  in  a  graphic  display,  from  which  to  determine 
the  extent  of  the  transient.  Figure  5  is  a  wiring  diagram 
of  the  program,  indicating  the  relationships  between  each  of 
the  individual  procedures. 

Each  of  the  procedures  will  now  be  discussed  briefly. 
The  Data  Menu  allows  the  user  to  build  a  new  data  directory, 
recall  an  existing  directory,  or  list  all  directories  made 
previously  that  are  available.  Throughout  the  thesis,  the 
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Figure  5.  Flowchart  of  Output  Analysis  Program 

term  directory  will  be  used  to  denote  a  string  of  SLAM  II 
output  files  sequentially  listed.  The  data  directories  are 
built  by  reading  in  a  column  of  sequential  output  files  of 
uniform  length.  For  example,  in  the  case  of  the  thesis, 
four  output  files  of  length  1000  are  read  in.  The  user  is 
asked  to  provide  a  name  for  the  directory.  The  program  then 
takes  this  name,  adds  a  . DCT  extension,  and  saves  the  file 
to  disk  for  future  use.  The  recall  option  simply  retrieves 
one  of  these  previously  built  directories,  while  the  list 
option  provides  the  user  a  listing  of  .DCT  files  that 
currently  exist. 

Once  the  data  directory  has  been  built  or  recalled,  the 
user  is  asked  whether  to  evaluate  multiple  or  single  output 
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files. 


If  the  multiple  file  option  is  selected,  the  program 


averages  each  data  point  across  all  replications  and  plots 
these  averages  on  the  screen.  For  instance,  the  first 
observation  of  each  of  the  four  runs  would  be  added  together 
to  obtain  a  total,  this  total  divided  by  4  would  then  be 
plotted  at  X  =  1 .  This  procedure  was  repeated  1000  times. 

When  the  single  file  option  was  selected,  the  program 
used  the  technique  of  moving  averages  to  smooth  out  the 
data.  The  equations  and  methodology  for  this  technique  are 
covered  in  Chapter  II.  The  user  must  specify  the  moving 
average  window’s  halfwidth.  A  plot  of  the  averaged  data  is 
then  presented  on  the  screen. 

When  the  data  from  either  option  is  plotted  on  the 
screen,  the  user  has  the  ability  to  scroll  forwards  or 
backwards  through  the  data,  600  points  at  a  time.  The  user 
then  determines  an  estimate  of  where  the  transient  period 
ends,  and  enters  this  value.  The  program  takes  this  in  as 
the  variable  CUTOFF.  An  example  of  the  final  screen  display 
is  shown  in  Figure  6. 

With  the  CUTOFF  value  identified,  the  program  goes 
through  each  of  the  output  files  in  the  directory  and 
truncates  the  first  CUTOFF  number  of  values.  The  user  is 
asked  to  name  the  new  file  containing  only  the  steady  state 
data.  The  name  specified  is  given  a  . SST  extension  by  the 
program  and  is  saved  to  the  disk. 
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Figure  6.  Output  Analysis  Program  Graph  of  Data 


The  user  is  given  the  option  to  perform  batch  means  for 
a  single  file,  independent  replications  for  multiple  files, 
or  return  to  the  original  data  menu.  The  batch  means 
procedure  uses  Fishman’s  technique  (8:514),  while  the 
independent  replications  procedure  follows  Welch’s  equations 
(27:294-296),  both  of  which  are  discussed  in  Chapter  II. 

The  program  then  provides  the  mean,  its  associated  variance, 
and  confidence  interval  data  as  shown  in  Figures  7  and  8. 

The  major  advantage  of  this  program  is  that  it  provides 
the  user  the  ability  to  assess  multiple  data  files,  and 
instantly  see  the  impact  of  selecting  different  cutoff 
values.  This  makes  sensitivity  analysis  very  feasible,  with 
very  little  additional  effort  required. 
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BATCH  MEANS 


INDEP  WITH  25  BATCHES 

MEAN:  12.45 

VARIANCE:  146.83 

95%  CONFIDENCE  INTERVAL  : 
-11.79  <=  MEAN  <=  36.68 


Figure  7.  Batch  Means  Output  Analysis  Data 


INDEPENDENT  REPLICATIONS 


95%  CONFIDENCE  INT  COMPUTED  WITH  T=  2.7 

MEAN  :  12.17 

VARIANCE  :  1.28 

10.60  <=  MEAN  <=  13.74 

ADDITIONAL  RUNS  REQ  FOR  +/-  5%  INTERVAL  =  1578 


Figure  8.  Independent  Replications  Output  Analysis  Data 
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Application  of  Kalman  filtering 

The  key  to  implementing  Kalman  filtering  is  to  specify 
the  characteristics  of  the  system  which  is  to  be  modelled 
correctly.  To  accomplish  this,  the  software  package 
FORECAST  MASTER  (9:7-1  -  7-13),  and  in  particular,  the  state 
space  modelling  option,  was  employed.  FORECAST  MASTER  uses 
the  Goodr ich-Lar imore  algorithm  (9:7-8)  to  fit  the 
appropriate  ARMA  model  to  the  observed  data. 

This  ARMA  model  was  then  converted  into  the  appropriate 
state  space  representation  for  Kalman  filtering  (9:7-8  -  7- 
11;  11:101;  25:92).  The  user's  manual  addresses  the 

differences  in  their  algorithm  from  the  Akaike-Mehra 
algorithm,  referencing  Mehra's  work  on  the  single  adaptive 
filter  (18:175).  The  program  provides  the  state  transition 
matrix  G,,  the  Kalman  gain  matrix  CG,K],  and  the  measurement 
(forecast)  matrix  F,  of  Eqs  (36)  and  (37),  referenced  in 
Chapter  I  I . 

In  evaluating  the  Kalman  filter  performance,  only  the 
first  1000  data  values  from  each  of  the  50  files  (see  Figure 
4)  were  used.  These  are  exactly  the  same  data  values  used 
to  estimate  the  mean  with  the  batch  means  technique. 

The  one  stipulation  to  use  the  adaptive  Kalman  filter 
was  that  the  data  had  to  be  equally  spaced  in  time.  It  was, 
therefore,  interpolated.  The  two  techniques  attempted  were 
a  cubic  spline  and  linear  interpolation  between  successive 
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points.  Both  of  these  techniques  are  addressed  in  detail  in 
the  data  formatting  section  below. 

Once  the  equally  spaced  data  was  obtained,  the 
appropriate  state  space  model,  and  Kalman  filter  matrices 
had  to  be  computed.  Because  the  filter’s  matrices  will 
become  constant  when  the  system  has  reached  steady  state, 
observing  the  changes  within  the  Kalman  gain  matrix  provides 
information  on  when  steady  state  has  been  achieved.  At  this 
point,  FORECAST  MASTER’S  data  limitation  of  2000 
observations  was  encountered. 

In  order  to  evaluate  the  filter,  data  sets  of  2000 
observations  were  generated  in  several  ways.  The  first  data 
set  formed,  consisted  of  the  last  200  observations  from  the 
first  ten  output  files.  The  second  set  contained  the  last 
100  observations  of  the  first  20  files.  The  third  set  was 
comprised  of  the  last  50  observations  of  40  files,  while  the 
fourth  set  contained  the  last  40  observations  of  all  50 
files.  The  computer  program  used  to  obtain  these  values  is 
shown  in  Appendix  D.  Each  of  these  four  data  sets  was  then 
evaluated  by  FORECAST  MASTER,  for  autocorrelation  and  to 
determine  the  G,,  C6,KD,  and  F,  matrices. 

In  deciding  the  appropriate  order  for  the  ARMA  model, 
and  therefore,  the  Kalman  filter,  the  data’s  autocorrelation 
was  assessed.  Figure  9,  suggests  a  second  order  Markov 
process  (1:206;  16:183),  which  indicates  that  the 
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appropriate  state  space  model  for  the  data  should  have  at 


least  two  states.  Evaluation  of  the  Akaike  Identification 


Criteria  (9:6-4)  indicated  that  the  second  order  model 


provided  the  best  "fit"  of  the  data. 


1.0^ 
0.9  j. 
0.8  . 
0.7  . 
0.6  . 
0.5  . 
0.4  . 
0.3  . 
0.2  - 
0.1  . 

0  _ 
-0.1  . 
-0.2  - 
-0.3  . 
-0,4  . 
-0.5  . 
-O.G  . 
-0.7  . 
-0.8  - 
-0.9  - 
-1.0  - 


Figure  9.  Autocorrelation  Plot  of  Equally  Spaced  Output,  1=10 


In  examining  the  Kalman  filter  matrices  generated,  the 
values  for  the  last  two  sets  of  data  were  effectively  the 
same.  This  indicated  that  the  best  possible  model  for  the 
data  had  been  achieved.  The  resultant  Kalman  filter 


matrices,  obtained  from  FORECAST  MASTER,  are  shown  in  Figure 


10,  in  a  sample  output  format  from  the  program,  where  the 
program's  notation  differs  from  Meinhold  and  Singpurwalla 
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(19:123)  referenced  in  Chapter  II.  In  the  program  output 
the  r  matrix  represents  the  transition  matrix  (G,),  the  K 
matrix  remains  the  Kalman  gain  matrix  K  (G,K],  and  the  H 
matrix  represents  the  measurement  (forecast)  matrix  (F,). 
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Forecast  matrix  H  jj 

1 . 0000  0  4  0000  'i 


Figure  10.  Kalman  Filter  Values  from  FORECAST  MASTER 

The  state  space  modelling  option  also  provided  a  screen  with 
statistical  information  on  the  data  set  used  for  the  model 
(9:7-4),  shown  in  Figure  11.  The  information  provided 
gives  insight  into  the  basic  characteristics  of  the  data 
which  was  modelled.  The  fist  two  lines  indicate  the  average 
autocorrelation  between  each  of  the  residual  values  and 
residuals  which  occurred  a  "lag"  of  1  to  12  before.  The 
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AUTOCORRELATIONS  OF  LAGGED  RESIDUAL  ERRORS 


Lag : 


123456789  10  11 


K50 


.00  .01  -.00  .03  -.00  .02  -.09  -.05  -.08  -.04  -.04  -.00 
-.00  .01  .02  -.01  -.02  .01 


Statistic 

Number  of  observations 
Mean  value  of  K50 
Standard  deviation  of  K50 
Standard  error  of  forecast 
R-square  (corrected  for  mean) 
F(4,1994) 

Adjusted  R-square 
Ljung-Box  Test  =  Chisq(14) 
Durbin-Watson  Statistic 
Standardized  AIC  Statistic 
Standardized  BIC  Statistic 


Value  Probability 

1998 

11.162600 
10.331002 
6.435969 
0.612484 
787.897226 
0.611706 
45.811991 
1.996797 
0.623600 
0.627105 


1.000000 

0.999970 

0.056935 


Figure  11.  State  Space  Modelling  Statistics 

best  way  to  describe  the  concept  of  lag  is  with  an  example. 
Residual  4  is  "lag  1"  from  residual  5,  while  it  is  also  "lag 
3"  from  residual  7.  The  intent  of  reviewing  these 
autocorrelations  is  to  ensure  that  after  the  data  has  been 
modelled,  that  the  residuals  do  reflect  a  "white  noise" 
process  with  little  to  no  correlation.  The  table  gives  the 
number  of  observations  used  was  1998,  of  2000  available. 

This  was  due  to  the  fact  that  the  second  order  model 
required  two  previous  values  to  make  the  next  prediction,  so 
the  first  two  data  values  could  not  have  predictions.  The 
next  two  entries  provide  the  mean  of  the  data  values 
provided  and  the  standard  deviation  of  that  mean.  The 
standard  error  of  the  forecast  is  particularly  important, 
representing  the  square  root  of  the  underlying  system 
variance  (R,)  shown  in  Equation  (37)  of  Chapter  II. 
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The  R-squa*^e,  F  value,  and  adjusted  R-square  all  apply  to 
the  adequacy  of  the  Kalman  filter  in  modelling  the  data, 
where  an  R-square  value  of  1.0  indicates  a  perfect  fit 

(21:423).  Both  the  I _ jung-Box  test  and  the  Durb  in-Ulat  son 

statistic  indicate  the  "degree  to  which  the  error  residuals 
over  the  historical  sample  adhere  to  the  hypothesis  that 
they  are  independently  normally  distributed,  with  zero  mean" 
(9:6-5).  The  Durb i n-Wat son  test  statistic  is  "the  ratio  of 
the  sum  of  squares  of  the  first  differences  of  the  errors, 
divided  by  the  sum  of  squares  of  the  errors"  (9:6-5).  As  an 
indication,  the  hypothesis  is  supported  more  as  this  value 
approaches  2.0.  The  LJung-Box  test  is  similar  but  reviews 
several  lag’s  autocorrelation,  rather  than  Just  the  first 
(9:6-6) . 

Data  Formatting 

The  batch  means  and  independent  replications  techniques 
required  only  the  entities’  actual  time  in  system,  while 
Kalman  filtering  required  these  values  along  with  the  time 
between  observations.  This  resulted  in  the  output  files, 
generated  by  the  SLAM  II  program,  being  operated  upon  at 
least  twice  to  get  the  necessary  formats.  The  data  files 
generated,  and  the  programs  used,  are  shown  in  Figure  12, 
and  described  below. 
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(LININT.PAS) 


Figure  12.  Flowchart  of  Data  Formatting 
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The  entries  within  each  of  the  blocks  on  the  flowchart 
indicate  the  file  names  given  to  the  data  files  generated. 
The  entries  within  the  parentheses  ()  indicate  the  programs 
used  to  develop  each  data  file.  Table  1  lists  all  of  the 
programs  above  and  the  purpose  of  their  output  files. 


PROGRAM  PURPOSE 

LININT.PAS  Generates  a  file  of  linearly  interpolated  data 

at  a  specified  sample  rate  from  an  existing 
data  file. 

SPEED. PAS  Produces  a  data  file  from  the  original  SLAM  II 

output  which  contains  only  the  time  in  system 
val ues . 

TRUNCEQ.PAS  Truncates  the  transient  portion  of  the  data 
from  the  data  file. 

C0NriLE2.PAS  Concatenates  portions  of  the  steady  state  data 
files  to  form  a  file  of  2000  values  to  be 
evaluated  by  FORECAST  MASTER. 

FORECAST  State  space  modelling  option  was  incorporated 

MASTER  to  generate  the  appropriate  Kalman  filter 

matrices  for  the  files  generated  with 
C0NFILE2.PAS  (9:1). 

MSOFE  Generates  the  Kalman  filter  next  step 

predictions,  based  on  the  matrices  from 
FORECAST  MASTER  and  the  equally  spaced  data 
from  LININT.PAS  (4:1). 

OUTPUT. PAS  The  Output  Analysis  Program  generates  the 
mean  and  its  associated  variance  of  the 
original  and  Kalman  filter  predicted  time  in 
system  values. 

MEAN. PAS  Calculates  the  grand  mean  and  its  associated 

variance  for  the  means  of  the  Kalman  filter 
predictions,  with  cutoff  values  from  0  to  25'/. 
at  57.  increments. 


Table  1.  Programs  Used  and  Their  Purposes 
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Evaluating  the  data  files  for  the  Kalman  filtering 
approach  demonstrated  that  the  cubic  spline  interpolation, 
from  the  software  package  MATRIX;,  (12:4-34),  was  not 
feasible  due  to  the  data’s  high  variability.  The  cubic 
spline  criteria  to  pass  through  every  data  point  with  a 
continuous  function  (3:118)  caused  the  interpolation  to 
greatly  exaggerate  the  data.  Where  the  original  data  varied 
from  approximately  1/1000  to  64  minutes,  the  interpolated 
data  ranged  from  -2700  to  6700  minutes.  At  this  point,  the 
linear  interpolation  method  was  employed.  Its  values  were 
restricted  to  the  range  of  the  actual  data,  as  the 
interpolated  values  were  drawn  from  a  line  connecting 
original,  sequential  data  points.  A  plot  of  an  original 
data  set  and  a  partial  plot  of  the  fitted  cubic  spline  are 
shown  in  Figures  13  and  14. 
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When  the  mean  values  for  the  equally  spaced  data  were 
compared  to  the  batch  means  for  the  corresponding  original 
output  data,  they  were  consistently  lower.  This  is  due  to 
the  dynamics  of  the  queueing  system.  The  requirement  for 
equally  spaced  data  automatically  biases  the  linearly 
interpolated  values  low.  When  the  queue  is  full,  leading  to 
higher  times  in  system,  then  the  entities  exit  at  an 
exponential  rate  of  15/hr,  the  service  rate.  When  the  queue 
is  empty,  the  time  between  observations  is  higher,  while  the 
total  time  in  system  for  the  entity  decreases.  This 
resulted  in  the  lower  time  in  system  entities  having  a 
larger  distance,  on  a  time  axis,  between  them.  This,  in 
turn,  meant  that  as  the  time  between  samples  decreased,  the 
percentage  of  time  that  these  lower  time  in  system  values 
were  measured  increased.  As  is  shown  in  Chapter  IV,  this 
caused  the  mean  to  be  biased  lower  as  the  interpolated 
sampling  rate  increased. 

Summar  y 

Now  that  the  data  formats  and  techniques  have  been 
discussed,  the  three  techniques  must  be  compared.  Chapter 
IV  presents  the  results  of  each  technique.  Comparisons  were 
accomplished  for  cutoff  values  of  0  to  25’/.  of  the  data, 
based  on  the  calculation  of  the  system’s  steady  state  mean, 
and  its  957.  confidence  interval  coverage  rate  of  the 
analytic  and  system  average  means.  Chapter  V  presents 
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conclusions  and  recommendations,  along  with  identifying  some 
of  the  problems  which  needed  to  be  resolved  in  the 
implementation  of  the  Kalman  filter  technique. 
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Results  and  Discussions 


W. 


Output  Analysis 

This  chapter  presents  the  results  of  the  batch  means, 
independent  replications,  and  Kalman  filtering  techniques  in 
assessing  the  system’s  steady  state  mean  and  variance. 
Comparisons  are  based  upon  computation  of  the  grand  mean, 
the  average  size  of  the  predicted  957.  confidence  interval 
halfwidth,  and  the  percentage  of  time  the  confidence 
intervals  for  each  run  covered  the  grand  and  analytic  means 
for  each  cutoff  value.  The  grand  mean  represents  the 
average  of  the  means  from  each  of  the  50  data  files,  where 
the  analytic  mean  is  the  steady  state  mean  computed  below. 

To  predict  the  system’s  steady  state  mean  analytically, 
equations  from  Introduction  to  Probability  Models,  by  Ross 
are  used  (24:306-314).  The  statistic  of  interest  was  the 
time  spent  in  the  system  by  each  individual  entity, 
represented  by  the  letter  W.  To  determine  this  value, 
required  the  two  variables  which  represent  the  system’s 
average  arrival  rate,  X,  and  average  service  rate,  p. 

The  arrival  rate  represents  the  average  number  of 
entities  which  enter  the  system  during  a  specific  time 
interval.  The  time  between  arrivals  for  this  research  was 
drawn  from  an  exponential  distribution  with  a  mean  of  6, 
resulting  in  an  average  arrival  rate  of  10  entities/hour. 
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The  service  rate  indicates  the  average  amount  of  time  that 
the  entity  must  spend  within  a  particular  service  activity. 
This  value  was  drawn  from  an  exponential  distribution  with  a 
mean  value  of  4,  making  the  average  service  rate  15 
ent i t i es/hour . 

Given  these  definitions  and  the  equations  from  Ross 
(24:306-314),  the  average  time  each  entity  spends  in  the 
system  can  be  determined  with  the  following  equations. 

Arrival  Rate  X  =  10/hr 

Service  Rate  p  =  15/hr 

U  =  l/(p  -  X  )  (47) 

W  =  l/(5/hr)  =  .2  hr  =  12  minutes 

The  analytic  mean  of  12  minutes  was  then  used  as  a  benchmark 
for  the  output  analysis  techniques. 

Batch  Means.  The  simulation  output  files  were  sorted 
to  obtain  only  the  first  1000  actual  time  in  system  (TIS) 
observations,  for  each  of  the  50  data  files.  The  SLAM  II 
output  data  format  is  shown  again  in  Figure  16. 

This  data  was  loaded  into  the  Output  Analysis  Program, 
which  graphed  the  moving  average  with  a  window  width  of  10, 
as  shown  in  Figure  17.  This  graph  was  then  evaluated  to 
determine  the  extent  of  the  transient. 
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Upon  evaluating  the  above  figure,  it  is  unclear  as  to 
where  exactly  the  transient  period  ends.  Because  of  this, 
various  cutoff  values  were  tried.  The  batch  means 
algorithm,  developed  by  Welch  (27:307)  and  described  in 
Chapter  II,  was  used  for  each  cutoff  value  to  obtain  the 
values  listed  in  Table  2. 

The  grand  mean  and  average  halfwidth  represent  the 
average  of  the  50-run’s  mean  and  halfwidth  values  for  each 
cutoff.  Cutoff  values  on  this  table  are  by  number  of 
observations,  while  coverage  rates  indicate  the  percentage 
of  the  50  confidence  intervals  for  each  cutoff  which  covered 
the  corresponding  grand  mean  and  analytic  mean  of  12. 


CUTOFF 

GRAND 

MEAN(S. D. ) 

AVG 

HALFWIDTH 

(MEAN) 

•/.  COVER 

(12) 

7.  COVER 

AVG  BATCH 
SIZE 

0 

12. 63( 1 . 3) 

2.47 

88 

98 

28.  67 

50 

12.55(1.4) 

3.05 

90 

96 

28.48 

100 

12.10(1.3) 

2.91 

94 

94 

28.80 

150 

12. 18(1.4) 

3.23 

94 

96 

29.  12 

200 

11.21(1.1) 

2.55 

100 

84 

22.08 

250 

11.35(1.1) 

2.59 

100 

92 

21.44 

Table  2.  Batch  Means  Performance  Results 
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Reviewing  the  values  contained  in  Table  2,  several 
interesting  facts  come  to  light.  First,  the  coverage  of  the 
grand  mean  increases  as  more  data  is  truncated,  while  the 
coverage  of  the  analytic  mean  decreases.  This  indicates 
that  the  process  is  still  operating  somewhere  in  the 
transient  phase  of  the  simulation,  with  a  bias  high  at  the 
beginning.  Second,  despite  the  loss  of  data,  the  last  two 
halfwidth  values  are  the  smallest.  This  can  be  attributed 
to  two  reasons:  decreasing  variance  as  the  system  warms  up, 
and  the  reduced  number  of  observations  per  batch  allowing 
for  more  batches,  and  thus  a  tighter  interval. 

Independent  replications.  The  independent 
replications  technique  requires  two  or  more  runs  of 
simulation  output.  In  this  research,  the  technique  was 
performed  with  four  independent  output  files  of  1000  data 
values  concatenated  together  (see  Figure  16). 

The  Output  Analysis  Program  evaluated  these  files  using 
the  equations  developed  by  Welch  (27:282)  as  presented  in 
Chapter  II.  As  in  the  case  of  batch  means,  the  extent  of 
the  transient  period  was  not  easily  distinguished.  This  can 
be  seen  in  Figure  18.  Again,  multiple  cutoff  values, 
ranging  from  0  to  250,  were  tried.  Overall  average  values 
are  shown,  for  each  cutoff  in  Table  3. 
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Figure  18.  Graph  of  Averaged  Replications  Data 


CUTOFF 

GRAND 

MEANCS. D. ) 

AVG 

HALFWIDTH 

(MEAN) 

•/.  COVER 

(  12) 

7.  COVER 

0 

12. 55( . 93) 

1 . 29 

68 

62 

50 

12.41(1.1) 

2.47 

72 

76 

100 

12.28(.96) 

1.68 

84 

86 

150 

12. 08 ( .97) 

1.50 

80 

78 

200 

11.46(.82) 

1.35 

78 

66 

250 

11. 16<.91) 

1  .  15 

70 

56 

Table  3.  Independent  Replications  Results 
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The  results  shown  in  Table  3  demonstrate  the  problems 


which  can  be  encountered  if  too  much  data  is  used  when  the 
system  is  still  in  a  transient  state.  The  large  amount  of 
data  caused  the  confidence  intervals  to  be  drawn  in  too 
tightly  about  the  estimated  means.  Because  the  system  was 
still  in  the  transient,  these  reduced  confidence  intervals 
were  centered  at  values  below  the  true  mean.  This  resulted 
in  the  coverage  rates,  of  both  the  grand  and  analytic  means, 
to  be  much  lower  than  those  for  batch  means,  indicating  that 
data  would  have  to  be  obtained  further  out  in  the  run  for 
independent  replications  to  be  used  effectively.  As  a  note 
of  interest,  if  the  true  mean  had  been  unknown,  the  interval 
would  have  been  thought  to  have  contained  the  true  mean  957. 
of  the  time,  based  on  the  calculations  presented  in  Equation 
(10),  and  the  average  time  in  system  would  have  been  biased 
1  ow . 

Figure  19  presents  a  graphical  depiction  of  the  mean 
values  obtained  by  both  batch  means  and  independent 
replications,  versus  the  analytic  steady  state  mean  of  12. 
This  figure  indicates  that  the  value  of  the  estimated  system 
mean  drops  as  the  cutoff  value  is  increased.  This  was 
directly  the  opposite  of  the  result  expected,  giving  two 
indications.  First,  the  system  was  biased  high  early  in  the 
transient,  due  to  a  large  initial  buildup  in  the  queue  at 
the  start  of  each  simulation.  Second,  the  values  with 


63 


00  150 

CUTOFF  VALUE 
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Tigurc  19.  Compar i &on  of  Estimated  and  Analytic  Means 

higher  cutoffs  had  mean  values  lower  than  the  analytic  mean, 
indicating  the  process  was  probably  still  operating  in  the 
transient.  These  two  facts  must  be  kept  in  mind  while 
evaluating  the  three  techniques. 

Kal man  Filter.  The  original  SLAM  II  output  data  was 
linearly  interpolated  at  intervals  of  10  (1=10)  and  then 
evaluated  using  FORECAST  MASTER  (9:7-8)  to  obtain  the  Kalman 
filter  data  shown  in  Figure  20. 

These  matrices  and  the  data  were  then  run  through  the 
computer  program  Multimode  Simulation  for  Optimal  Filter 
Eval uat ion  (MSOFE)  (4:1).  Forecasts  generated  were 
evaluated  by  the  batch  means  portion  of  the  output  analysis 
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Figure  20.  FORECAST  MASTER  Output  on  the  Kalman  Filter,  1=10 

program  to  obtain  mean  and  957.  confidence  level  halfwidth 
information  for  each  cutoff  value.  Cutoff  intervals  were 
scaled  57.  of  the  observat  ions,  and  performed  from  0  to  257. 
of  the  data  to  evaluate  their  effect.  For  a  data  set  of 
1000,  ":his  would  be  50,  for  500  it  is  25,  and,  as  will  be 
used  later,  for  1300  it  is  65.  Table  4  displays  the  grand 
mean  (standard  deviation)  and  average  957.  halfwidth  values, 
by  cutoff,  for  data  interpolated  at  each  tenth  minute.  To 
calculate  the  mean  and  halfwidth  values  for  the  Kalman 
filter  predictions,  the  batch  means  portion  of  the  Output 
Analysis  Program  was  employed.  The  coverage  values  reflect 
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the  percent  of  the  confidence  intervals  which  cover  the 


gr  and 

mean  and  the 

anal y t i c  mean 

of  12. 

CUTOFF 

GRAND 
MEANCS. D. ) 

AVG 

HALFUIDTH 

(MEAN) 

•/.  COVER 

(12) 

7.  COVER 

♦»  PER 
BATCH 

0 

11.23(1.2) 

2.74 

88 

84 

15.4 

25 

11.18(1.3) 

2.96 

92 

80 

16.5 

50 

10. 87(1.2) 

2.83 

90 

78 

15.0 

75 

10.55(1.2) 

2.80 

90 

69 

13.  9 

100 

10.42(1 . 0) 

2.87 

90 

76 

15.0 

125 

9. 63 ( .99) 

2.41 

100 

52 

11.8 

Table  4.  Kalman  Filter  Performance  Results 

Again,  as  in  batch  means,  the  number  of  observations 
required  per  batch  drops  as  the  cutoff  value  increases. 

This  explains  the  tightest  confidence  interval  corresponding 
to  the  largest  cutoff,  even  though  less  total  data  was  used. 
The  coverage  rates  of  the  grand  mean  were  comparable  to 
those  for  the  batch  means  results  on  the  original  data.  The 
coverage  rates  for  the  analytic  mean,  however,  were  much 
lower.  This  was  because  the  value  for  the  Kalman  filter 
estimate  of  the  mean  was  lower  than  those  for  either  batch 
means  or  independent  replications. 

Review  of  the  equally  spaced  data  showed  that  after 
linear  interpolation  the  overall  data  mean  was  lower  than 
originally.  The  reason  appeared  to  be  that  the  sampling 
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rate,  6/hr,  was  so  much  smaller  than  the  service  rate  of 
15/hr.  Normally,  this  would  imply  that  the  sampling  rate 
should  be  increased.  This,  however,  assumes  that  samples 
are  taken  directly  from  the  process,  and  not  interpolated 
from  existing  data.  The  effect  of  increasing  the  sampling 
rate  was  then  investigated. 

The  SLAM  II  output  was  once  more  interpolated;  this 
time  with  a  time  between  observations  of  4  minutes.  The 
interpolated  values  were  then  evaluated  by  FORECAST  MASTER 
to  obtain  the  Kalman  filter  data  shown  in  Figure  21. 
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Figure  21.  FORECAST  MASTER  Output  on  Kalman  filter,  1=4 
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GRAND 

AVG 

(MEAN) 

(12) 

#  PER 

CUTOFF 

MEANCS. D. ) 

HALFWIDTH  V. 

COVER  : 

/.  COVER 

BATCH 

0 

9. 26( . 87) 

3.83 

97 

73 

72.7 

65 

9.31(1.1) 

4.  15 

97 

73 

85.  6 

130 

8. 51 ( . 99) 

3.82 

100 

58 

75.4 

195 

8.45(1.0) 

4.22 

100 

55 

86.6 

260 

7.71 (.98) 

3.91 

100 

41 

78.  7 

325 

6.76(.80) 

3.08 

100 

10 

50.  1 

Table  5. 

Kalman  Filter 

Resul ts. 

1=4 

Evaluating  in  the  same  manner  as  for  the  interval  of  10 
resulted  in  the  data  shown  in  Table  5,  which  shows  the  lower 
means  and  coverage  rates  as  a  result  of  increasing  the 
sampling  rate,  using  linear  interpolation.  The  two  major 
impacts  come  from  the  reduction  of  the  mean  and  the  size  of 
the  batches  required  to  pass  the  independence  test. 

The  decrease  in  the  mean  value  is  due  to  the  equal 
weighting  of  the  interpolated  values.  The  problem  arose 
because  entities  which  had  to  wait  in  the  queue  exited  the 
system  approximately  every  4  minutes,  matching  the  service 
rate  exactly.  Those  entities  that  didn’t  have  to  wait 
exited,  on  the  average,  with  a  higher  time  between 
observations  and  lower  times  in  system.  Because  the  time 
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between  observations  for  the  lower  "time  in  system"  values 
was  higher,  the  equally  spaced  observations  landed  on  the 
lines  connecting  these  lower  points  more  often.  Histograms 
are  shown,  for  one  of  the  data  sets,  in  all  three  formats: 
the  original  SLAM  II  output,  linearly  interpolated  for  1=10, 
at)d  linearly  interpolated  for  1=4-  These  are  shown  in 
Figures  22,  23,  and  24  respectively,  where  the  time  axis 
remains  the  same,  but  the  frequency  scale  changes. 
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Figure  22.  Histogram  of  Original  Output  Distribution 
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Figure  23.  Histogram  of  1=10  Interpolated  Data  Distribution 
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Figure  24.  Histogram  of  1=4  Interpolated  Data  Distribution 
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The  histograms  reflect  the  increased  weighting  of  the 
lower  time  in  system  observations  with  increased  sampling. 
Note  that  the  frequency  value  of  150  on  the  original  data 
histogram  represents  15’/.  of  the  entire  population,  where  for 
the  1  =  10  data  it  is  30V.,  and  for  the  1=4  data  it  is 
approximately  11.5V..  This  does  not  affect  the 
interpretation,  however,  of  how  the  distribution  is  moving 
to  the  left,  indicating  that  the  average  of  the  observations 
decreased  as  the  sampling  rate  increased.  Appendix  E 
contains  a  two  dimensional  array  with  the  time  between 
observations  vs.  time  in  system. 

Now  that  the  decreasing  mean  has  been  explained,  the 
large  difference  in  the  batches  required  must  be  addressed. 
The  answer  again  lies  in  the  time  between  observations. 

When  the  interval  was  10,  the  data  was  autocor rel ated  as  a 
typical  second-order  Markov  process.  (16:183;  1:206)  This 

autocorrelation  is  shown,  in  a  plot  from  FORECAST  MASTER,  in 
Figure  25. 

When  the  time  between  interpolated  points  is  reduced  to 
four,  the  bias  of  the  higher  time  between  observations 
values  comes  into  play  once  more.  The  lower  time  interval 
causes  the  interpolated  data  to  be  highly  autoc or r el  at ed , 
masking  much  of  the  underlying  system’s  mechanics.  This 
autocorrelation  is  shown  in  Figure  26. 
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Figure  25.  Autocorrelation  of  1=10  Data 
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Figure  26.  Autocorrelation  of  1=4  Data 
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Although  no  units  are  displayed,  the  autocorrelations  in 
Figure  25  are  spaced  at  time  intervals  of  10  minutes,  while 
those  in  Figure  26  are  every  4  minutes.  A  direct  comparison 
of  the  lag  autocorrelations  is,  therefore,  impossible. 

With  the  sources  of  the  bias  identified,  the  means  and 
average  delta  between  the  original  and  interpolated  means 
were  computed.  These  are  shown  in  Tables  6  and  7,  where  the 
cutoff  values  represent  the  percentage  of  the  total 
observations  included  in  the  cutoff.  The  number  0, 
therefore,  reflects  the  cutoff  value  0  for  all  methods  while 
5(7.)  indicates  50,  25,  or  65,  depending  on  the  method. 


CUTOFF  Percentage 


i 

0 

5 

iO 

20 

25 

□RIG 

12.63 

12.55 

12.  10 

12.  18 

11.21 

11.35 

10 

11.23 

11.18 

10.87 

10.55 

10.42 

9.63 

4 

9.26 

9.31 

8.51 

8.45 

7.71 

6.76 

Table  6 . 

Comp  ar i son 

of  Mean 

Val ues 

I_ 

0 

5 

10 

15 

20 

25 

AVG 

10 

1 . 40 

1 . 32 

1.23 

1.63 

0.79  1 

.  72 

1 . 35 

4 

3.  37 

3.  24 

3.59 

3.  73 

3.50  4 

.  59 

3.67 

Table  7.  Differences  between  Mean  Values 
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The  low  coverage  rates  shown  in  Tables  4  and  5  are 
influenced  directly  by  the  interpolation  underestimating 
the  mean.  The  effect  on  the  coverage  rate  of  adding  the 
average  bias,  S  =  1.35  for  1=10  and  S  =  3.67  for  1=4,  to  the 
estimated  mean  values  is  shown  in  Table  8. 


cuTorr 

(4) 

MEAN 

Coverage  Rates 

(4) 

MEAN  +  S 

(10) 

MEAN 

(10) 

MEAN  +  S 

0 

73 

100 

84 

94 

5 

73 

100 

80 

94 

10 

58 

100 

78 

94 

15 

55 

100 

69 

90 

20 

41 

100 

76 

90 

25 

10 

87 

52 

82 

Table  8.  Coverage  Rates  with  Delta  Added  In 

The  dramatic  increase  in  percent  coverage  of  the 
analytic  mean  value  of  12  indicates  that  the  filter  is 
performing  well,  but  is  biased  by  the  process  by  which  the 
data  was  obtained. 

Calculation  of  System  Variance.  Currently,  insight  to 
the  underlying  system  variance  is  obtained  through  observing 
the  variance  of  the  estimated  mean  and  through  the  use  of 
pseudovalues,  in  a  technique  known  as  Jackknifing  (15:227; 
22:737).  The  Jackknifing  technique  effectively  computes 
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pseudovalues  which  represent  calculations  of  the  variance  as 
each  one  of  the  data  values  is  individually  removed  from  the 
data  file.  The  intent  is  to  remove  some  of  the  sensitivity 
of  the  standard  variance  calculations  to  departures  from 
normality,  giving  a  better  indication  of  the  true  system 
performance.  The  Kalman  filtering  output  from  FORECAST 
MASTER,  however,  allows  for  easy  computation  of  the  system 
var 1 anc  e . 

The  computation  is  possible  due  to  the  assumptions  that 
the  system  has  attained  steady  state,  and  that  the 
measurement  values  (time  in  system)  are  perfect,  in  that 
they  are  explicitly  known.  Given  these  facts,  the  standard 
error  of  the  forecasts  was  taken  from  the  output  data  shown 
in  Figure  20.  This  value  represents  the  best  estimate  of 
the  system  variance  before  the  next  observation  is  seen,  R,. 
It  will  be  used  along  with  the  matrices  from  Figure  10  to 
compute  the  true  system  variance  at  the  time  of  measurement, 
E,  using  Equation  (42). 

E  =  (R,)  -  K*F,*(R,)  (42) 

where  K  has  been  substituted  for  R,F,’  (V,  +  ^ »  the 

Kalman  gain  matrix.  Because  of  the  nature  of  the  F,  matrix, 
C1,0],  the  value  of  E,,,  the  variance  associated  with  the 
next  prediction,  can  be  computed  using  Equation  (43). 
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The  variance  in  Figure  27  begins  at  a  specified  level, 
dropping  straight  down  as  the  first  observation  is  taken. 
After  the  observation  the  variance  increases  asymptotically 
toward  the  initial  variance  until  the  next  observation  is 
taken.  This  process  continues  until  some  steady  state  is 
achieved,  where  the  top,  R,,  and  the  bottom,  E,,  of  the 
sawtooth  become  constant.  These  values  now  represent  the 
range  over  which  the  variance  of  the  system  can  travel. 

These  values  are  useful  for  comparing  the  variance  of  one 
system  to  another  to  assess  the  variability  of  the  process. 

In  summary,  each  of  the  techniques  provided  some 
insight  into  the  M/li/1  queue  which  was  modelled.  The 
independent  replications’  results  were  biased  low  by  the 
fact  that  the  process  was  still  within  the  transient,  while 
the  Kalman  filter  experienced  problems  with  the  way  in  which 
the  data  was  interpolated.  Recommendations  to  correct  the 
interpolation  errors  are  addressed  in  Chapter  V. 
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V.  Conclusions  and  Recommendations 


Cone lusions 

The  objectives  set  forth  for  this  thesis  were  the 
development  of  a  computer  program  to  automate  the  batch 
means  and  independent  replications  simulation  output 
analysis  techniques  and  the  evaluation  of  possible  benefits 
of  Kalman  filtering  in  the  same  area.  Both  of  these  goals 
were  achieved,  while  several  areas  for  future  research  were 
ident i f led. 

The  Output  Analysis  Program  provides  the  analyst  with 
the  ability  to  assess  the  system's  steady  state  mean  and  its 
associated  variance  from  the  simulation  output.  Designed 
for  implementation  on  Air  Force  standard  micro-computers,  it 
allows  the  analyst  to  perform  the  following  functions  from 
within  one  program. 

1.  Data  averaging  across  replications 

2.  Moving  averages  within  a  single  long  run 

3.  Transient  identification,  from  a  graph  of  the  data 

4.  Data  truncation  of  the  transient 

5.  Generation  of  a  file  for  steady  state  data 

6.  Batch  means  evaluation  of  a  single  run 

7.  Independent  replications  evaluation  for  multiple 
runs 

With  these  capabilities,  the  analyst  can  also  instantly 

assess  the  impact  of  different  cutoff  values. 
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The  data  used  to  evaluate  the  simulation  output 
included  the  system’s  warm-up  from  its  initial  state  of 
empty  and  idle.  This  was  done  purposely  to  reflect  real 
world  conditions  and  assess  the  effects  of  the  transient. 
This  approach  also  reduced  the  length  of  the  SLAM  II  runs 
required,  keeping  system  costs  lower. 

In  evaluating  the  output  with  the  batch  means  and 
independent  replications  techniques,  it  was  apparent  that 
the  latter’s  confidence  intervals  were  too  small  to  obtain 
good  coverage  of  the  mean  value.  This  was  due  to  the  higher 
number  of  observations,  which  caused  the  confidence 
interval’s  width  to  become  too  tight  around  the  biased  mean. 

Batch  means  provided  better  coverage  of  both  the  grand 
mean  of  the  data  and  the  analytic  mean  of  12,  but  these 
rates  dropped  as  the  transient  cutoff  value  increased.  This 
indicated  that  the  system  was  still  operating  within  some 
portion  of  the  transient,  with  an  initial  high  bias  caused 
by  the  original  filling  of  the  queue. 

Employing  the  Kalman  filter  resulted  in  a  number  of 
interesting  findings.  The  requirement  for  equally  spaced 
data,  for  the  particular  adaptation  technique  used  to 
generate  the  adaptive  Kalman  filter,  led  to  the  mean  being 
biased  low.  Elimination  of  the  average  bias  produced  957. 
confidence  interval  coverage  rates,  of  the  analytic  mean. 
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which  were  superior  to  both  batch  means  and  independent 
rep 1 i c  at i ons . 

The  greatest  benefit,  however,  comes  in  the 
identification  of  the  systems's  true  underlying  variance. 
Because  of  the  nature  of  the  Kalman  filter,  the  variance  of 
the  forecasts  is  a  constant  value  once  steady  state  has  been 
attained.  Using  this  value,  along  with  the  Kalman  gain  and 
state  transition  matrices,  explained  in  Chapter  II,  the 
system's  variance  can  be  computed.  This  is  accomplished 
with  equations  from  the  update  cycle  of  the  usual  form  or 
the  innovations  form  of  the  Kalman  filter,  as  described  in 
Chapter  II  (18: 176) . 

Another  aspect  of  the  Kalman  filter  which  needs  to  be 
addressed  deals  with  its  primary  purpose.  It  provides  an 
excellent  "step  ahead"  forecast  for  the  next  observation's 
characteristics,  and  this  capability  is  currently  available 
in  software  packages  such  as  FORECAST  MASTER. 

Areas  for  Future  Development 

Several  areas  have  come  to  light  during  the  course  of 
this  thesis  which  would  be  interesting  to  follow-on 
research.  They  deal  primarily  with  future  research 
possibilities  for  the  Kalman  filter,  and  mainframe 
adaptation  for  the  Output  Analysis  Program. 

Harrison  and  Stevens  (10:341-362)  used  Bayesian 
estimation  to  compute  the  probability  that  each  new 


80 


observation  is  either  from  steady  state,  an  outlier,  a  step 
change,  or  a  slope  change.  The  process  effectively  used  an 
array  of  Kalman  filters  to  compute  these  probabilities, 
assuming  the  system  variance  was  known  in  advance. 

Extension  of  this  theory,  known  as  multiple  model  adaptive 
estimation  (17:129,131),  however,  utilizing  a  bank  of 
adaptive  Kalma{i  filters,  could  be  directly  applied  to 
identifying  the  end  of  the  simulation  transient  period. 

□nee  the  system  has  reached  steady  state,  to  within  a 
specified  probability,  the  covariance  kernel  can  be  computed 
and  estimation  of  the  mean  can  be  accomplished. 

The  requirement  for  equally  spaced  data  caused  many 
bias  problems  in  the  interpolated  data.  Three  approaches  to 
attempt  removing  this  bias  are  presented.  First,  further 
research  could  address  the  impact  of  using  the  original  data 
values  directly  from  SLAM  II,  and  what  impact,  if  any,  this 
has  on  the  estimate  of  the  system’s  mean. 

The  second  technique  is  an  extension  of  the  concept  of 
"moving  average,"  presented  in  Chapter  II  (27:293).  The 
problem  with  the  current  interpolation  technique  is  that  the 
interpolateu  points  sample  more  often  across  the  lower  time- 
in-system,  higher  t ime-between-obser vat  ion  data  points,  and 
each  is  considered  a  valid  data  point.  The  impact  of  these 
points  could  be  reduced  by  simply  averaging  the  first  five 
actual  data  values  on  each  side  of  the  interpolated  value. 
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This  parallels  the  moving  average  technique  and  should 
result  in  a  much  more  accurate  representation  of  the  data. 

The  third  approach  deals  with  the  exponential 
representation  of  the  transition  matrix,  F,  =  e*'*  (23:68). 
Given  this  relationship,  identification  of  the  underlying 
matrix  A  would  be  represented  as: 

A<yt  =  In  (F)  (48) 

which  represents  an  infinite  Taylor's  series. 

Once  the  A  matrix  is  determined  it  will  allow  for  non-fixed 
sample  rates  of  the  data,  equating  directly  to  the  (St  values 
between  the  actual  observations.  Individual  F  matrices,  F^, 
could  then  be  generated  for  each  of  the  observations  using 
the  equat ion : 


(A(St.) 

F^  =  e  (49) 

Another  area  which  came  to  light  was  to  use  of  the 
Kalman  predicted  values,  along  with  the  original  data,  to 
form  a  tighter  confidence  interval  on  the  estimator  of  the 
mean.  In  this  way,  the  Kalman  filter  predictions  would  be 
treated  as  a  second,  and  independent,  representation  of  the 
system's  steady  state  mean  value. 

Limitations  were  encountered  in  the  upper  bound  of  2000 
observations  for  FORECAST  MASTER.  Development  of  a  program 
which  incorporates  adaptive  filtering  techniques  for  larger 
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files  would  be  beneficial  in  determining  filter  values 
across  larger  data  sets. 

Along  the  same  lines,  the  Output  Analysis  Program  has 
certain  file  size  limits,  as  it  was  designed  for  a  micro¬ 
computer.  The  program  could  be  converted  to  Pascal  and 
placed  on  a  mainframe  computer,  to  alleviate  these 
1 i m 1 1  at  ions . 
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Appendix  A:  SLAM  1 1  and  FORTRAN  Code  for  M/M/ 1  Queue 

SLAM  II  Code: 

GEN, CART  PORTER, MM  1 QUEUE, 1/2/89,5, , , , , , ,72; 

LIMITS, 2, 3, 500; 

NETWORK 

CREATE, EXPONCS, 1),0, 1, 1000; 

QUEUE ( 1 ) ; 

ACT(1)/1,EXP0N(4,2); 

EVENT,  1 ; 

ENDNETWORK; 


FORTRAN  CODE; 


PR0GR/2M  MAIN 
DIMENSION  NSET(IOOOO) 

COMMON/SCOMl /ATRIB( 100) , DD< 100) , DDL( 100) , DTNOUl, 1 1 , MFA, MSTOP, NCLNR 
1 ,  NCRDR,  NPRNT,  tvmJN,  NN3ET ,  NTfFE,  SS<  100) ,  SSL  ( 100) ,  TNEXT ,  TMDW,  XX  ( 100) 

COmON  QSET(IOOOO) 

EaUIV«_ENCE<NSET<  1 ) ,  QSET<  1 ) ) 

NNSET=10000 

NCRDR=5 

^PR^^^=& 

NTAPE=7 

OPEN(10,FILE=’MOrvE.OUT',STATUS=’OLD’ ) 

C«_L  SLA^ 

STOP 

END 

C 

SUBROUTINE  EVENTCI) 

COMMON/SCOMl /ATRIB <  100) ,  DD ( 100) ,  DDL ( 100) ,  DTNOW,  1 1 ,  NFA,  MSTOP,  NCLNR 
1 ,  NCRDR,  NPRNT,  NM?UN,  NfvGET,  NT«=E,SS(  100) ,  SSL  ( 100) ,  TNEXT,  TNOW,  XX  ( 100) 


C  COMMUTE  TINE  BETWEEN  ARRIVALS  AND  SEND  TO  FILE  10 
GOTO  <1)1 

1  XX(3)=TN0W-ATRIB(1) 

XX(2)=TlvDW-XX(l) 

WRITECIO, 10)  XX(3),XX(2) 


04 


FaRWT(2F12.7) 


XX(1)=TN0U) 

RETURN 

END 


OUTPUT  ANALYSIS  PROGRAM 


USER’S  MANUAL 


Oevoloped  by 
Captain  Charlea  H.  Porter 
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OUTPUT  ANALYSIS  PROGRAM 


Introduct ion : 


The  Output  Analysis  Program  was  developed  to  evaluate 
the  mean  and  variance  for  simulation  outputs  using  the 
techniques  of  independent  replications  and  batch  means.  The 
program  was  developed  to  operate  on  sequential  output  files 
of  uniform  length,  with  one  entry  per  line.  Data  must  be  in 
an  ASCII  format.  As  a  note,  computation  of  957.  confidence 
intervals  uses  a  t-value  rounded  off  to  the  nearest  1/lOth. 

There  are  four  main  topic  areas  within  the  program:  the 
Data  Menu,  the  Graph,  Batch  Means,  and  Independent 
Replications.  This  manual  will  address  each  individually, 
along  with  operating  instructions. 

Output  of  any  desired  screen  may  be  obtained  by 
pressing  <SHIFT>  Prt  Scr,  while  a  working  printer  is  on 
1  ine. 

Program  Operation: 

The  Output  Analysis  Program  is  menu  driven.  Selection 
of  options  on  menus  requires  only  the  number  of  the 
selection  to  be  pressed,  that  is  with  no  carriage  return. 
When  the  user  is  queried  for  file  names  and  cutoff  values  a 
carriage  return  is  required  to  end  the  entry.  File  names 
should  be  kept  short,  but  have  a  maximum  limit  of  16 
characters. 

Initialization: 

Data  for  the  Output  Analysis  Program  must  be  on  the 
same  computer  disk  drive,  and  in  the  same  computer  directory 
as  the  executing  file,  or  have  its  appropriate 
dr i ve\d irectory  identified.  For  instance,  if  the  program  is 
on  drive  A:  and  the  data  is  in  the  directory  DATA,  on  drive 
C: ,  in  a  file  named  OUTPUT.DAT,  then  when  queried  for  the 
file  name  the  user  must  respond  C:\DATA\OUTPUT.DAT.  If  the 
data  is  in  the  same  directory  as  the  program  it  is  necessary 
only  to  type  the  filename  and  extension  (OUTPUT.DAT).  To 
accomplish  this  either  the  executable  file  OUTPUT.EXE  can  be 
copied  onto  the  computer  hard  disk  (if  available)  along  with 
the  data  to  be  used,  or  OUTPUT.EXE  can  be  copied  onto  a 
floppy  diskette  along  with  the  data.  As  a  note  the  .EXE 
file  requires  less  than  60K  of  disk  space. 
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To  initialize  the  Output  Analysis  Program,  place  the 
program  diskette  into  a  5  1/4  inch  floppy  disk  drive,  log 
into  the  appropriate  drive  and  either  copy  the  file 
OUTPUT.EXE  to  the  hard  disk  or  to  a  floppy  containing  the 
data  file.  Now,  log  into  the  appropriate  drive  and  type 
"GRAPHICS."  This  will  enable  the  graphic  displays  within 
the  program  to  be  printed.  Then  type  "OUTPUT"  .  The 
program's  Data  Menu  will  now  appear  on  the  screen.  Pressing 
the  <shift>  key  and  the  Print  Screen  key  together  will  now 
pass  whatever  is  displayed  on  the  screen  to  the  printer. 

Data  Menu: 

The  Data  Menu  provides  the  user  with  four  options. 


1)  BUILD  NEW  DATA  DIRECTORY 

2)  RECALL  EXISTING  DIRECTORY 

3)  LIST  EXISTING  DIRECTORIES 

0)  EXIT 


1)  BUILD  NEW  DATA  DIRECTORY 


This  option  allows  the  user  to  construct  a  new  data 
file  from  a  simulation  output  file.  Again,  the  data  must  be 
in  sequential  files  of  equal  length.  The  user  is  queried 
for  the  name  of  the  new  directory  to  build,  the  number  of 
successive  outputs  files  to  be  read  in,  the  length  of  these 
output  files,  and  the  name  of  the  simulation  output  file. 

The  values  are  then  read  from  the  specified  simulation  file 
into  a  file  "F I LENAME . dc t " .  See  below  for  example. 

EXAMPLE  1.  (Computer  Queries  are  in  BOLD) 

SIM. OUT  (Consists  of  two  simulations  outputs  of  five  values 
each,  obtained  from  a  program  such  as  SLAM2) 

1 

2 

3 

4 

5 
2 

3 

4 
4 

6 
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NAME  OF  NEU  DIRECTORY:  EXAMPLE 


NO.  OF  FILES  IN  DIRECTORY:  2 
NO.  OF  ENTRIES  PER  FILE:  5 
NAME  OF  FILE:  SIM. OUT 
DIRECTORY  EXAMPLE. DCT  IS  NOW  BUILT 

At  this  point  the  data  has  been  loaded  into  a  .DCT  file 
and  can  be  evaluated  by  the  graph  options  listed  in  the  next 
sec  t i on . 

If  the  user  accidently  input  SUM. OUT  for  the  name  of 
the  simulation  output  data  file  (rather  than  SIM. OUT),  the 
program  would  return  the  appropriate  error  message  that  the 
file  did  not  exist.  It  then  provides  the  user  the  choice  of 
reentering  the  file  name  or  returning  to  the  Data  Menu. 

2)  RECALL  EXISTING  DIRECTORY 

This  option  is  used  if  the  data  file  required  was 
loaded  previously,  and  the  name  is  known.  The  user  will  be 
prompted  for  the  directory  name  (NOTE:  The  .DCT  extension  is 
assumed  and  not  entered).  The  program  will  then  provide  the 
user  with  the  number  of  files  and  number  of  entries  per  file 
for  the  directory  which  was  recalled.  The  user  then  has  the 
option  to  choose  one  of  the  two  graph  options  described  i 
the  GRAPH  section  below,  or  to  return  to  the  Data  Menu  an 
specify  a  different  file  for  use. 

If  a  file  is  requested  which  does  not  exist  then  the 
program  will  return  an  error  message  indicating  the  file 
could  not  be  found.  The  user  is  then  given  the  choice  of 
trying  to  load  a  file  again  or  returning  to  the  Data  Menu. 


3)  LIST  EXISTING  DIRECTORIES; 

Like  the  Recall  option,  this  option  allows  the  user  to 
retrieve  data  directories  previously  created.  When 
selected,  however,  it  provides  a  list  of  all  .DCT  files 
currently  on  disk.  The  user  is  then  prompted  to  either 
select  one  of  these  files  for  use  or  return  to  the  main  Data 
Menu.  If  the  user  wants  to  load  one  of  the  listed  files  the 
recall  routine  is  automatically  called  and  the  commands  are 
the  same. 
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a  D 


0)  EXIT: 


This  selection  terminates  the  Output  Analysis  Program 
and  returns  the  user  to  the  DOS  shell.  All  steady  state 
files  (discussed  in  the  Graph  Section)  and  . DCT  files 
generated  during  the  session  are  automatically  saved. 


GRAPH; 


The  Graph  portion  of  the  program  incorporates  a  number 
of  data  techniques  and  operations.  When  the  Data  portion  of 
the  program  terminates,  either  when  building  a  new  directory 
or  recalling  an  old  one,  the  user  will  have  the  following 
two  opt  ions. 


1 ) 

IDENTIFY 

TRANSIENT, 

MULTIPLE  FILES 

2) 

IDENTIFY 

TRANSIENT, 

SINGLE  FILE 

The  selection  criteria  are  self  explanatory, 
user  is  evaluating  multiple  runs,  planning  to  use 
Independent  Replications  option,  then  option  1  is 
If,  on  the  other  hand,  he/she  is  working  with  one 
for  Batch  Means,  then  option  2  is  selected. 


If  the 
the 

sel ec  ted . 
long  run. 


1)  IDENTIFY  TRANSIENT.  MULTIPLE  FILES; 

In  option  1,  the  multiple  runs  are  averaged  across  the 
individual  observation  numbers.  For  instance,  the  first 
value  of  file  one  is  averaged  with  the  first  value  of  file 
2,  and  file  3,  etc.  These  average  values  are  then  displayed 
on  the  computers  screen,  along  with  a  scale.  The  user  must 
then  interpret,  from  the  display,  the  point  at  which  the 
transient  period  ends.  The  user  is  allowed  to  scroll 
through  all  of  the  input  data  by  typing  a  "1"  to  back-up  one 
screen,  and  a  "2"  to  go  forward  one.  When  the  user  has 
determined  the  extent  of  the  transient  he/she  types  the 
number  3  and  is  queried  to  enter  the  appropriate  transient 
"cutoff"  value,  and  answer  yes  or  no  (Y/N)  to  having  a  copy 
of  the  current  graph  screen  printed.. 

2)  IDENTIFY  TRANSIENT.  SINGLE  FILE; 

In  option  2,  the  data  is  "smoothed"  by  running  a  moving 
average  across  the  data.  Reference  Peter  D.  Welch’s  work  in 
the  Computer  Perfomance  Modelling  Handbook  (27:293).  This 
is  accomplished  by  averaging  values  within  a  "window"  whose 
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width  is  specified  by  the  user  at  the  prompt  "SIZE  OF  MOVAVG 
UINDOU  and  plotting  this  average  as  the  value  at  the 

center  of  the  window.  This  window  moves  to  the  right,  one 
observation  at  a  time,  until  there  isn't  enough  data  to  fill 
the  window.  This  smoothed  data  is  then  graphed  on  the 
screen,  and  the  user  has  the  same  options  as  noted  above  to 
scroll  through  the  data.  After  the  "Cutoff"  value  is  input, 
and  the  user  has  answered  whether  he/she  would  like  a  print 
out,  the  cutoff  number  is  used  to  develop  a  file  of  steady 
state  values. 


STEADY  STATE  FILE  GENERATION: 

□nee  the  transient  "cutoff"  value  has  be  attained,  the 
program  edits  the  current  . DCT  file  and  truncates  the  first 
"cutoff"  number  of  entries  from  each  of  the  files.  The  user 
is  asked  for  the  name  of  the  new  Steady  State  File,  to  which 
the  program  adds  a  . SST  extension.  This  file  will  be  the 
one  used  to  compute  the  mean  and  variance  for  the  system 
while  in  steady  state. 

When  the  .SST  file  has  been  created,  the  program  will 
ask  the  user  to  select  either  the  Batch  Means  or  Independent 
Replications  techniques.  Selection  is  based  on  the  number 
of  files  within  the  .SST  file. 


BATCH  MEANS; 

The  Batch  Means  portion  of  the  program  evaluates  one 
long  run  of  the  simulation.  The  data  within  the  .SST  file 
is  partitioned  into  larger  and  larger  batches  until 
independence,  between  the  batches,  is  obtained.  The 
algorithm  used  was  developed  by  Fishman  (8:510-511).  When 
the  test  for  independence  passes,  the  number  of  batches 
required  for  independence,  the  estimated  mean  and  variance, 
and  the  95'/.  confidence  interval  (assuming  a  t-value  of  1.96) 
are  displayed  on  the  screen. 

If  independence  is  not  obtained  before  the  number  of 
batches  decreases  to  less  than  eight  (8),  then  the  message 
"TEST  FOR  INDEPENDENCE  FAILED,  NEED  MORE  DATA",  will  ;>ppear 
on  the  screen.  Typing  a  carriage  return  will  now  return  the 
user  to  the  initial  Data  Menu,  from  which  he/she  can 
accomplish  further  runs  or  exit. 
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Independent  Replications; 


The  Independent  Replications  procedure  also  uses  the 
truncated  . SST  data  file.  The  difference  is,  of  course, 
that  there  are  multiple  simulation  outputs  in  the  .SST  file 
used  by  the  Independent  Replications  procedure. 

The  procedure  automatically  calculates  the  95'/. 
confidence  interval  for  the  data,  using  the  algorithms  noted 
by  Welch  in  the  Computer  Performance  Modelling  Handbook 
(27:280).  The  mean,  variance,  confidence  interval,  and 
number  of  additional  runs  required  to  be  within  +/-  5'/.  of 
the  computed  mean  are  displayed  on  the  screen  for  the  user's 
information.  The  additional  runs  required  are  computed  with 
the  formula  identified  by  Welch,  for  Pilot  Experimentation 
(27:320) . 

After  the  data  are  displayed,  typing  carriage  return  will 
return  the  user  to  the  initial  data  menu,  to  accomplish 
further  runs  or  to  exit. 
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Appendix  C: 


Output  Anal  vs i s  Program 


{tM  63520,0, 65519>  (Sets  Heap  Size  to  Maximum  Allowable) 

PR0GR(=M  Sim_YSIS; 

{This  program  contains  the  graphic  lead  in  slide  for  the  Output 
Analysis  Program  and  a  call  to  the  unit  PROC,  uhich  contains  the 
procedures  necessary  to  do  the  calculations.  No  actual  computations 
or  data  activities  take  place  within  the  main  program) 

USES  CRT, DOS, PROC, GR^PH;  {SPECIFY  TT-E  UNITS  THE  PR0GR(=M  WILL  USE) 

PROCEDURE  LEADIN; 

{This  procedure  simply  produces  the  introductory  screen  for  the 
Output  Analysis  Program) 

VAR 

GD,GM,ERRCODE  : INTEGER; 

BEGIN 

READLN;  {QRPRHICS  LEAD  IN  SLIDE  GENERATION) 

GD:=DeTECT;  {FROM  EXAMPLE  IN  TURBO  PASCAL  REF.  GUIDE) 

INITGRAPH(GD,GM, ' ’ ); 

ERRCODE : =GRAPHRESULT ; 

IF  ERRCODE  =  GROK  THEN 
BEGIN 

SETTEXTSTYLE CTRIPLEXFONT, HORIZDIR, 5) ; 

SETBKCOLOR(BLUE) ; 

TEXTCOLQRC YELLOW) ; 

OUTTEXTXY (35, 100, 'OUTPUT  Atfv^YSIS  PROGRAM'); 

SETTEXTSTYLE  (DEFAULTFONT,  HORIZDIR,  1 ) ; 

OUTTEXTXY (400, 240, 'Developed  by'); 

OUTTEXTXYOO,  310, 'Press  <RETURN>  to  begin'); 
SETTEXTSTYLE(TRIPLEXFONT, HORIZDIR,  1 ) ; 

0UTTEXTXY(400,260, 'Capt  Charles  H.  Porter'); 

READLN; 

END 

ELSE  {IN  CASE  OF  G»^ICS  ERROR) 

WRITELN('GR«=HICS  ERROR  ;' ,GR«=>HERRORMSG( ERRCODE) ) ; 

CLOSEGRAPH; 

TEXTMODE  (C80); 

END; 

(*  MAIN  MEMJ  *) 

BEGIN 

CLRSCR* 

LEADIN’  {DISPLAYS  LEAD  IN  GRA^ICS  FROM  ABOVE) 

DATA(TENP,NUMFILE,NLJMENT);  {INITIAL  CALL  TO  UNIT  PROC) 

CLRSCR; 

END. 
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UNIT  PROC; 

{THIS  UNIT  IS  COMPRISED  OF  PLL  TKI  PROCEDURES  ivECESS^Y  TO  PERFORM  TVE 
OUTPUT  «sW_YSIS  TEO-NIQUES  OF  BATCH  t«E/2*vJS  INDEPENDENT  REPLICATIONS. 
IT  IS  ACCESSED  BY  THE  PROGRAM  SirVEYSIS  TO  PERFORM  THESE  FUNCTIONS} 


INTERFACE 

USES  CRT, DOS,  GRAPH, GRAPHS; 

TYPE  APA  =  ARRAY  CO.. 2000]  OF  REAL; 
WRI  =  STRINGC20]; 
m_E=  TEXT; 

VAR  TErP,TEM=P,MAVG 
NUrENT,NUMFILE 
PAUSE 

FILf^^V^,  DCTIvWIE 
FIL 

ORDER,  CUTOFF,  NUM 


PROCEDURE  PRTSC; 

PROCEDURE  BOXES; 

PROCEDURE  STSTATE; 

PROCEDURE  BATCH; 

PROCEDURE  SST; 

PROCEDURE  GRAPHIT; 

PROCEDURE  AVERAGE; 

PROCEDURE  MOVAVG; 

PRCDCEDURE  ^EWDATA; 

PROCEDURE  VIEW; 

PROCEDURE  DATA(TEIP>:ARA;VAR  NLmLE,NUr«ENT:  INTEGER) ; 


IM=LEr«ENTATION 

{THIS  PORTION  OF  THE  UNIT  DOES  TFE  ACTUAC  OPERATIONS  ON  TT-E 
PROCEDURES/FUNCTIONS  IDENTIFIED  IN  THE  INTERFAOE  SECTION) 


APA;  {GLOBAL  VAPIABLE  DECUVJATION) 

INTEGER; 

WRI; 

FYLE; 

INTEGER; 


{PROCEDURE  DECLARATION  IN  TT-E  INTERFACE  SECTION) 


94 


FUNCTION  EXIST(FILENi=»1E:STRING):BQC]L^W; 

{This  function  identifies  whether  a  file  actually  exists. 

It  is  used  called  Just  before  a  file  is  to  be  accessed. 

If  the  file  does  not  exist  the  user  is  given  the  opportunity 
to  try  again,  or  back  up.  If  this  function  were  not  included 
then  the  program  would  abort  (whenever  an  action  was  attempted 
on  a  non-existent  file> 

VAR 

FIL  iFIUI; 

BEGIN 

ASSIGNCFIL.FILElvWiE) ; 

{*!-> 

RESET  (FIL); 

{*!  +  } 

EXIST  :=  (lORESULT  =  0); 

END; 

PROCEDURE  PRTSC; 

(THIS  PRCCEDURE  PRINTS  DATA  CURRENTLY  DISPLAYED  ON  SCREEN.  IT  IS 
ACCESSED  BY  THE  GRAPHIT  PROCEDURE  TO  GET  A  "HPRD  COPY"  OF  THE 
ACTUAL  DATA  PLOT} 

REG: REGISTERS; 

BEGIN 

INTR($5, Dos. Registers(REG) ) 

END; 

PROCEDURE  BOXES; 

{This  procedure  draws  boxes  on  the  screen,  within  v«hich 
the  selection  rules  are  placed,  in  menu  format •> 

TYPE 

bar  =  stringC79]; 


var 

lyne  :  bar; 

UL, vl ,ur, 1 1 , Ir, 1 i,r i,mi;char ; 
X: INTEGER; 
begin 

TEXTCCLORCmiTE); 

UR:=CH^(187); 

LR:=CHAR(188); 

LL:=OWR(200); 

LI;=Cm?(185); 

RI:=a-WR(204); 

VL:=a-W?(106); 

UL:=OW(201); 

LYTC:  ='  - 

CLRSCR; 
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G0T0XY(12,5);  WRITECUL); 
G0T0XY(62,5);  WRITE(UR); 
G0T0XY(13,5);  WRITE (LY^E); 
G0TDXY(13,7);  WRITECLYNE); 
Q0T0XY(13, 18); WRITE (LYNE); 
QOTOX Y ( 1 3 , 20 ) ; WR I TE < LYTE ) ; 
FOR  X:=  6  TO  19  DO 
BEGIN 

G0T0XY(12,X);  WRITE(VL); 
G0T0XY(62,X);  WRITE(VL); 
END; 


G0T0XY(12,20) 
G0T0XY(62,20) 
G0T0XY(12, 18) 
Q0T0XY(62, 18) 
G0T0XY(12,7); 
GOTOXY(62,7); 
TEXTC:OLOR  (YELLOW); 
END; 


WRITE(LL); 
WRITE(LR); 
writeCr i ) ; 
writeCl i) ; 
write(ri); 
wr  ited  i) ; 


PROCEDURE  STSTATE; 


{This  procedure  evaluates  the  data  in  a  steady  state  directory 
vi/iich  was  developed  in  the  SST  procedure.  It  uses  the  technique  of 
independent  replications  and  calculates  the  confidence  interval  for  the 
data  provided.  A  T-value  of  1.96,  corresponding  to  a  957.  confidence 
interval  for  large  amounts  of  data,  will  be  used.) 

\/PR 

I, J,K,N 

MU, TOTAL, SQD IF 
MUmT,  LOW,  HIGH,  T 
SSQ,S,MUTOT,SUM,DIF 
PAUSE 
TVA_ 

NUTRUN, ADDRUN 

BEGIN 

CLRSCR; 

BOXES; 

TEXTCOLOR(YELL(DW) ; 

G0T0XY(25,6);  WRITEC 'STEADY  STATE  AfvWLYSIS’); 

M:=l;  {INITI«_IZES  COUNTER  FOR 

DATA  RETRIEVAL) 


UNI  tutK; 

ARA; 

RB^; 

REAL; 

CWR; 

STRINGC4); 

INTEGER: 


MLrT0T:=O; 

ZERO) 

MLJmT:=0; 


{INITI(=4_IZE  VARIABLES  TO 
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FOR  I:=l  TO  NUrriLE  DO 
T0T«_Cn;=0; 


FOR  I:=l  TO  MJ-FILE  DO  {CX1'1=UTE  MEPMS  OF  INDIVIDUPL 

F1LES> 

BEGIN 

FOR  J;=l  TO  NLM  DO 
BEGIN 

TOTPLC I : ;  =TOTPL.C  I I+TEMPPCMD; 

M:=M+1; 

END; 

MUC I  ] :  =TOTALC  n/NUM; 

t-IJTOT:=MLJTOT+MU:n;  {SUM  PU_  OF  THE  INDIVIDUAL 

MEANS) 


END; 

MUHAT:=MLJTOT/NLIM^ILE;  CCOTPUTE  THE  GRAND  HEP^J) 

SLJM:=0; 

FOR  K;=l  TO  NUmUE  DO  CCPLCULATE  THE  VPRK^sJCE  OF  THE 

DATA) 

BEGIN 

SQDIFCKI ; =SQR<MUCKI-MUHAT) ; 

SUM:=SUM+SaDIFCK]; 

END; 

SSQ: =SUM/ (NUMFILE-1 ) ; 

S;=SaRT(SSa);  {OBTAIN  THE  ST«vJDf«D 

DEVIATION) 


IF  NUmLE=l  THEN 
T-VALUE) 

T:^12.71; 

IF  NLJrFILE=2  THEN 
T:=4.303; 

IF  MJMFILE=3  THEN 
T;=3. 182; 

IF  NUMFILE=4  THEN 
T:=2.776; 

IF  NUMFILE=5  Tl^N 


T;=2.571; 

IF  (MJMFILE=6)  OR  <NLJMFIUE=7)  THEN 
T:=2.4; 

IF  (NLJ^FILE=8)  OR  <NLJ*FILE=9)  THEN 
T:=2.3; 

IF  (NLJMFILE>=10)  AND  (NLJrFILE<=13)  T»^ 


{SET  THE  «=PROPRIATE 
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T:=2.2; 

IF  <NLrFILE>=^14)  AND  (NLJMFILE<=27)  THEN 
T:=2. 1; 

IF  (NUmLE>=28)  fiND  (NLmLE<=120)  THEN 
T:=2.0; 

IF  NLrFILE>120  THEN 
T:=1.96; 


STR(T,TVAL); 

G0T0XY(17, 10);  U)RITE(’957.  CXlsFIDENCE  INT  COMMUTED  WITH  T=’,TV(2t.); 

LOW:  =  (MLI-WT-((T*S)/SQRT(NUMFILE)));  {COMPUTE  THE  957.  CQNF 

INT) 

HIGH:  =  (MUmT+(  (T*S) /SGRT  (NUHFILE) ) ) ; 

Q0TQXY(22, 12);  WRITEC'EST.  MEPN  :  '  ,MLJH*^T:6:2) ;  (DISPLAY 

SOLUTIONS} 

G0T0XY(22, 13);  WR I TE (’ VARIANCE  ;  ’,SSQ:6:2); 

Q0TaXY(21, 15);  WRITE(LaW:9:2, '  <=  MEAN  <=  ’,HIGH:6;2); 

(The  variable  ADDRUN  represents  the  number  of  additional  runs  of 
the  simulation  required  to  obtain  a  confidence  interval  which 
IS  within  +/-  5"/.  of  the  mean.  Reference  Welch,  in  Chapter  6 
of  the  Computer  Performance  Modeling  Handbook,  pages  321-322) 


DIF: = (HIGH-LOW); 

NUMFUN;  =TRUNC ( (DIF/ (MUHAT/ 10) ) »^lJ^1B^JT) ; 

ADDRUN:  N^U'RU^H^^IUMENT; 

IF  ADDRUN<0  THEN 
ADDRUN: =0; 

Q0T0XY(14, 17);WRITE(’ADDITiav^  RUNS  REQ  FOR  +/-  57.  INTERVAL  = 
ADDRUN); 

PAUSE: =READKEY; 

END; 


PROCEDURE  BATCH; 

{This  procedure  utilizes  Fishman's  algorithm  for  computing  independence 
between  the  batches.  It  starts  with  a  batch  size  of  one,  and  then 
doubles  it  until  either  independence  is  obtained  or  the  number  of 
batches 

decreases  to  less  than  8.  Specifics  on  the  algorithm  can  be  found  in 
Fishman's  article,  "Grouping  Observations  in  Digital  Simulation", 
Management  Science, 24, 5, 1978,  pp  510-521.  The  intent  of  the  algorithm 
is  to  compute  system  mean  and  variance  from  one  long  run.) 
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BREAK 

LONGINT 

D,  1,  J,M,N 

INTEGER: 

A,B,VARC,MTOT,K,Z 

REAL; 

TTOP,TBOTT,MEAN,T 

REAL; 

TOT, YAVG, CM, LOW, HIGH 

REAL; 

MU 

fiPA; 

SUMSQR,  lEA^EQR,  SKSQ 

REAL; 

SK,HW 

REAL; 

(NUMBER  or  BATOES} 


BEGIN 

IF  NUhFILE-l  TIEN  {IF  THE  FILE  IS  APPROPRIATE  PROCEED} 

BEGIN 

TaT:=0;  {INITIALIZE  VARIABLES) 

Z:=0; 

CLRSCR; 

BOXES; 

TEXTCOLORCYELLOUI) ; 

GOTOXYC  28, 6);  WRITE  (’BATCH  MEA^E’ ) ; 

FOR  I;=l  TO  MJM  DO  {SUM  TVE  STEADY  STATE  DATA) 

TaT:=TOT+TEMPPCn; 


YAVG: =TOT/NUM;  {COMPUTE  THE  AVERAGE  OBS 

VALUE) 

MT0T;=0; 

D:  =0; 

K;=NUM; 

N;  =  l; 


WHILE  K  >=  8  DO  {WHILE  NO.  OF  BATCHES 

RETAINS  >8) 

BEGIN  {PERFORM  THE  BATCH  rEA^E 

TEST) 

TT0P:=0; 

TB0TT:=O; 

J:=l; 

Z;=1.96; 

BREAK:  =TRUNC<NUM/N);  {COtEUTE  NUMBER  OF  BATCHES) 

FOR  I:=l  TO  BREAK  DO  {FOR  EACH  BATCH) 

BEGIN 

TOT: =0; 

FOR  M;=l  TO  N  DO  {SUM  THE  NUMBER  OF 

ENTITIES/BATCH) 
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{CCmJTE  EACH  BATCH’S  rt(=M> 


BEGIN 

TCrr:=T0T+TEM=PCJ3; 

J:=J+1; 

END; 

MJCI3:=TQT/N; 

END; 

FOR  I:=l  TO  CBREA<-1)  DG  CCOMFUTE  THE  TEST 

STATISTICS} 

BEGIN 

A:  =  ( (MU:  I  :-MUC  I  +  n )* (MUC 1 3-MUL  I  +  l : ) ) ; 

TTOP: =TTaP+A; 

B: = (MJL I 3-YAVG)* (MUC I D-YAVG) ; 

TBOTT: =TBQTT+B; 

END; 

TBOTT :=TBOTT+(  (MUCBRB=*<D-YAVG)*(MUCBREA<:-YAVG) ) ; 

CM;=  1-(TT0P/(2*TB0TT)); 

VARC:=ABS((BREA<-2)/((BREA<*BREA<)-l));  {COTTUTE  VARIANCE} 
IF  CM  >  (Z*SQRT(VARC))  TVEN  (SEE  IF  STATISTIC  CM  PASSES 

TEST} 

BEGIN 

N:=2*N;  (IF  NOT,  INCREtENT  N  AND  K  AMD 

RETRY} 

K;=K/2; 

END; 

SUMSaR;=0; 

MT0T:=0; 


IF  CM  <  (Z*SaRT(VARC))  TFEN  (IF  CM  DOES  PASS  Tl-E  TEST} 
BEGIN 

D;=N;  (INDEPENDENCE  l-AS  BEEN 

OBTAINED} 

K:=0; 

FOR  I:=l  TO  BREA<  DO 
BEGIN 

MTOT:=MTOT+MUCI];  (CALCULATE  THE  GRAMD  MEAN} 

SUMSGJR:  =SUMSQR+SaR(MUC  13); 

END; 

(FIND  APPROPRIATE  T-VALUE} 

IF  NUMFILE=1  THEN 
T:=12.71; 

IF  NLJhriLE=2  THEN 
T:=4.303; 

IF  NLJrnLE=3  TVEN 
T!=3.  1B2; 

IF  NLJMFILE=4  THEN 
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T:=2.776; 

IF  NUmLE=5  Tl-EN 
T:=2.571; 

IF  <NLJMFILE=6)  OR  (NLJmLE=7)  THEN 
T:=2.4; 

IF  (NUMFILE=8)  OR  (NUrFILE=9)  THEN 
T;=2.3; 

IF  (NLmLE>=10)  OR  (NLJMFILE<=13)  THEN 
T:=2.2; 

IF  (N1J^ILE>=14)  OR  (NLJrnLE<=27)  THEN 
T:=2. 1; 

IF  (NLJMFILE>=2a)  OR  (NLJmLE<=120)  Tl-EN 
T:=2.0; 

IF  NLjmLE>120  THEN 
T:=1.96; 


HEAN:  =MTQT/BRE(=»<;  {COMMUTE  PND  V(=«IANCE} 

rE^vEQR:  =SaR  (MEAN) ; 

SKSQ:  =  <  (SLMSaR-(BREAi<*ME^^NJSQR) )  /  (BREA^-1 ) ) ; 
SK:=SQRT(SKSQ); 

HW:=T*(SK/(SG1RT(BREA<<)));  {COrT^UTE  CONF  INT  Hk^FWIDTH) 

Q0T0XY(23,9);WRITE(' INDEP  WITH  %BREA*</  BATCHES’); 
G0T0XY(23, 11);WRITE(’EST.  I-EAN:  ’,^EAN:6;2); 

QOTOXY (23, 12); WRITEC’ VARIANCE;  ’,SKSa:6:2); 

LOW:=HEW-HW;  {COHPUTE  CONFIDENCE  INT) 

HIGH;  =tEAN+HW; 

QOTOXY  (21, 15);  WRITE  (LOW;  9;  2,  ’  <=  HEW  <=  ’,  HIGH;  6;  2); 


END; 

END; 

(GIVEN  TH^T  INDEPENDENCE  IS  NOT  OBTAIHED  BEFORE  THE  NUMBER  OF 
ENTITIES 

PER  BATCH  FALLS  BELOW  8,  THEN  THE  FOLLOWING  HESSAGE  IS  DISPLAYED) 

IF  D=0  THEN 
BEGIN 

Q0T0XY(17, 12);WRITELN(’TEST  FOR  INDEPENDENCE  FAILED,  HEED  MORE 

DATA’ ) ; 

END; 

READLN; 

END; 

END; 

PROCEDURE  SST; 

(This  procedure  is  called  after  the  user  has  seen  a  graphical 
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representation  of  the  data,  with  confidence  intervals,  and  has 
an  idea  where  the  transient  period  terminates.  The  program 
queries  the  user  for  this  point,  and  then  modifies  the  current 
.dct  directory  to  build  a  .sst  directory  of  steady  state  data. 

After  the  directory  is  built  the  STSTATE  and  Batch  procedures 
are  called  to  obtain  the  data's  confidence  interval.) 

VAR 

SSTlviS^E  :URI; 

SST 

1,J,M,N,G0  iHvTTEGER; 

PAUSE  :CHPR; 

BEGIN 

CLRSCR; 

TEXTCOLOR(WHITE) ; 

BOXES; 

TEXTCQLQR ( YELLOW) ; 

G0T0XY(21,6);WRITE('CREATE  STEADY  STATE  DIRECTORY'); 

ir  TE^pnn=o  then  ccheck  to  m<E  sure  a  directory  is 

LOADED) 

begin 

Q0T0XY<17, 10);WRITE(’N0  DIRECTORY  IN  USE'); 

GOTOXYOy,  12);  WRITE  ('PLEASE  PRESS  <RETLJRN>  FOR  MAIN  MENU'); 

PAUSE: =READKEY; 

END; 

IF  TEMPI  IKX)  TTEN  IGIVEN  TWT  A  DATA  FILE  IS  LOADED) 

BEGIN 

lAt  this  point  the  program  will  echo  the  cutoff  value  loaded  in  the 
GRAPHIT  procedure.  The  user  will  then  be  queried  for  the  name  of 
the  Steady  State  file  which  will  be  generated  by  truncating  the  first 
"cutoff"  number  of  values  from  each  sequential  file.  The  program 
will 

take  the  user's  input  name  and  add  the  extension  ".SST"  ) 

Q0T0XY(17,9);WRITE('CUT0FF  '^OLUE  IS  ’, CUTOFF); 

Q0T0XY(17, 10);WRITE(’lvWME  OF  STEADY  STATE  DIRECTORY:  '); 

QOTOXY ( 18, 1 1 ) ; READLN(SSTNAME) ; 

SS'nv*^:=SSnvy¥t+'.SST';  CADD  Tl-E  .SST  EXTENSION  TO 

FILETvi^) 

ASS1GN(SST, SSTNAFE) ; 

REWRITE (SST);  fOPEN  THE  FILE  FOR  DATA) 

NLI1:=NLJMENT-aJT0FF;  (DEFINE  THE  NUMBER  OF  VALUES  PER 

FILE) 

WRITELN(SST,NUmLE);  <HM<E  THE  NUMBER  OF  FILES  THE  FIRST 

ENTRY) 
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WRITELN(SST,NLJM);  THE  NUMBER  CF  VPi.UES  PER  FILE, 

SECOND} 

M:=l;  N:=l; 

FOR  I:=l  TO  NUTFILE  DO  <READ  IN  TPE  V«_UES  FROM  THE  ORIGINAL 

DATA} 

BEGIN  CSET,  TRUNCATING  TVC  FIRST  "CUTOFF" 

V«_UES} 

FOR  J:=l  TO  NUMENT  DO 
BEGIN 

IF  J<CUTOFF  n-EN 
M:=M+1; 


IF  J>=CLrTOFF  TVCN 
BEGIN 

WRITELNCSST,  TETFCM}  ) ; 

TEm=CN]:=TEMPCM}; 

N:=N+1; 

M;=M+1; 

END; 

END; 

END; 

CLOSE (SST);  <CLOSE  TFE  ivEW  .SST  FILE} 

CLRSCR; 

BOXES; 

<Now  that  the  steady  state  data  file  is  built,  determine  if  the 
user  wants  to  evaluate  the  new  file  via  Batch  Means  (one  long  run) 
or  Independent  Replications  (Multiple  runs)} 

G0T0XY(21,6);WRITE('CREATE  STEADY  STATE  DIRECTORY’); 

GOTOXY(  17, 10);  WRITE  (’DIRECTORY  ’,SSnv*^,’  WG  BEEN  CREATED’); 
G0T0XY(17, 12);WRITE(’  1)  INDEPENDENT  REPLICATIONS’); 

G0T0XY(17, 13);WRITE(’  2)  BATCH  1*0^  EV«_UAT ION’ ) ; 
Q0T0XY(17,14);WRITE(’  3)  DATA  MENU'); 

G0T0XY(17, 19);WRITE(’PLEASE  ENTER  NUMBER  OF  CHOICE  ;  ’); 

READLN(QO); 

IF  (00=1)  AND  (NUr>FILE=l)  TPEN 
BEGIN 

TEXTai.OR(GREEN) ; 

QDTOXY( 17, 16); WRITE  (’MUST  USE  BATCH  ftANS  FOR  SINGLE  DATA  FILE’); 
Q0T0XY(17, 17);WRITE(’PRESS  <RETLJRN>  FOR  BATCH  ME«sB  PROCESS.’); 
TEXTCOLOR( YELLOW) ; 

READLN; 

GO: =2; 

END; 

IF  (002)  PND  (NUtFILE>l)  1>EN 
BEGIN 
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TEXTCOLGR (GREEN) ; 

GOTOXYC 17, 16); WRITE  (’MUST  USE  IND  REPLICATIONS  FOR  MULTIPLE 
FILES’ ); 

Q0T0XY(17, 17);WRITE(’PRESS  <RETURN>  FOR  IND  REP  PROCESS.’); 
TEXTCOLOR ( YELLOW ) ; 

READLN; 

GO;=l; 

END; 

CASE  GO  OF 

1  :  STSTATE; 

2  :  BATCH; 

END; 

END; 

END; 


PROCEDURE  GRAPHIT; 

(This  procedure  graphs  the  data  in  the  current  ,dct  directory  for  the 
user  to  review.  It  then  calls  to  the  SST  procedure  to  build  a 
directory 

for  the  steady  state  data. ) 

type  sett  =  array  C 1 . . 101 , 1 . .23  of  real; 
var 

PRYNT, pause  ;CHAR; 

GraphDriver  :  integer;  {  The  Graphics  device  driver  > 

GraphMode  :  integer;  <  The  Graphics  mode  value  ) 

Error(3ode  :  integer;  i  Reports  any  graphics  errors  > 

MaxColor  ;  word;  i  The  maximum  color  value  available  > 

OldExitProc  ;  Pointer;  <  Saves  exit  procedure  address  ) 

step  :  INTEGER; 

values  :  sett; 

x,Y,MAXADD,VAL,YY  :  integer; 

max, scale  :  real; 

WRITING  ;  WRI; 

YAX IS, I, START, STOP  ;  INTEGER; 

Vf^UE,YYY  :  STRINGC53; 

CHOICE, SER,NUMGRPH  ;  INTEGER; 

BEGIN 

STEP:=1; 

IF  MJMENT<600  THEN 

STEP;  =R0UND  ( 600/NLJMENT ) ; 

MAX  ;=0; 

FOR  X;=  1  TO  NLI’Bvn’  DO 

IF  (MA()GCX3>f’AX)  OR  (MAVGCX:*-l>r’AX)  THEN  BEGIN 
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MAX;=  r-^VGCX]; 

MAXADD:=X; 

END; 

IF  MAX<0  TVEN  WX:=nAX*-l; 

IF  MAX=0  TVEN  EXIT; 
graphdr iver : =2; 
graphmode: =4; 

InitGraph(graphdriver,graphmode, '  ’  <.  activate  graphics  } 

ErrorCode  :=  GraphResult;  C  error"’  > 

i f  ErrorCode  <  >  grOk  then 
begin 

WritelnC'Graphics  error;  GraphErrorMsg (ErrorCode) ) ; 
Halt(l); 
end; 

CHDICE:=1; 

SER: =1 ; 

^^JMQRPH;  =  (TRL^C  (NUr^^ 

ST(2RT:=1; 


WHILE  CHOICE  <>  3  DO 
BEGIN 

CLEflRDEVICE; 

IF  SER=0  THEN  (ERROR  MESSAGE  IF  USER  TRIES  TO  BACK  LP> 

BEGIN  (WVEN  ST«?TING  PT  IS  ON  SCREEN) 

SER;=1; 

ST#«T;=1; 

0LnTEXTXY(17, 165, ’STATING  POINT  IS  A_READY  ON  SCREEN’); 

END; 

IF  SER>NUMGRPH  TVEN  (ERROR  MESSAGE  IF  USER  TRIES  TO  GO  FORWARD) 
BEGIN  (WhEN  ENDING  PT  IS  ON  SCREEN) 

SER:  =NUMGRPH; 

ST«RT;=ST(^T-600; 

0UTTEXTXY<17, 165, ’END  OF  OUTPUT  DATA  IS  ON  SCREEN’); 

END; 

LI^E (25, 140,625, 140);  (DRAW  AXES  AND  TICK  MARKS) 

LINE  (25, 40, 25, 140); 

LINE(85, 140,85, 143); 

LINE( 145, 140, 145, 143) ; 

LItsE(205, 140,205, 143) ; 

LirvE(265, 140,265, 143) ; 

LIIVE(325, 140,325,143); 

Lir€(3B5, 140,385, 143) ; 

LI^E(445, 140,445, 143); 

L1^E(505, 140,505, 143) ; 
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LINE<565, 140,565,  143); 

LllvE(625, 140,625, 143); 

FCR  I:=l  TO  10  DO  <NLMBER  THE  Y-AXIS) 

BEGIN 

YY:=TRUNC( ( 100 -( 10* ( I-l )))« (MAX/100) ) ; 
STR(YY,YYY); 

□UTTEXTXY(5,  (40K  ( I-l  )*10)  )  ,  YYY)  ; 

END; 


FOR  1:=1  TO  10  DO  (NUMBER  TT-E  X-AXIS) 

BEGIN 

VAL:  =  (  (ST(^T-1)  +  (TRUNC(<  1*60) /STEP))); 
STR(V(^,V(2y_UE); 

□UTTEXTXYC (23+( 1*60) ) , 147, VALUE) ; 

END; 


M0VET0(25, 140);  (DRAW  TPE  600  DATA  PTS  STARTING  WITH 

(START-1 )> 

STOP:  =  (600+ (START  D); 

IP  STaP>NLJMENT  THEN 
STOP:  =NLI'CNT ; 

FOR  Y:=  START  TO  STOP  DO 

Lir«ET0(25+  ( (Y-  (START-  1 )  )*STEP) ,  130-round  ( (I^VG[Y]/max)*100) ) ; 
OUTTEXTXY(  17, 155,  ’  DLAST  SCREEN  2) NEXT  SCREEN  3)  ID  TRANSIENT’ ) ; 

G0T0XY(5O,21);READLN(CH0ICE);  (^OWS  USER:  TO  MOVE  ThFOUGH  TKE 
DATA) 

IF  CH0ICE=1  n-EN  (RESET  STARTING  PT  ACCORDINGLY) 

BEGIN 

SER:=SER-1; 

START: =START -600; 

END; 

IF  CHDICE=2  THEN 
BEGIN 

SER:=SER+1; 

START:  =START+600; 

END; 

END; 

(When  the  user  determines  a  "cutoff"  value  for  the  transient,  based  on 
the  graph  he/she  types  "3"  and  is  queried  for  the  cutoff  value.) 

OLnTEXTXY(  17,  175, 'WHERE  DO  YOU  THIlvK  TRA^IENTT  ENDS'’  :'); 
Q0T0XY(50,23); 

REA^XJM  ( CUTOFF ) ; 
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{If  the  uset^  wcxild  like,  the  graph  can  be  duniped  to  the  printer} 

□UTTEXTXYC 17, 185, ’WOULD  YOU  LIKE  A  PRINTOUT^  (Y/N):  ' ) ;PRYNT: =READKEY; 

IF  UPCASE(PRYfsrD=’Y’  THEN 
PRTSC; 

TEXTMODE (CBO); 

SST; 

END; 


PROCEDURE  AVERAGE; 

{This  procedure  can  be  accessed  from  the  MAIN  MENU  or  from  the 
RECALL  procedure.  It  averages  the  values  across  replications  and 
then  performs  a  Moving  Average  algorithm  with  the  moving  window 
size  determined  by  the  user.  This  data  is  then  forwarded  to  the 
GRAPHIT  prccedure  to  be  displayed  and  evaluated.} 

VAR 

I,J,K  '.INTEGER; 

TOT  ;(=RA; 

BEGIN 

FOR  I;=l  TO  NUMENT  DO  {INITIALIZE  PEL  ARRAY  VALUES  TO  ZERO} 

BEGIN 

T0TCI]:=0; 

HAVGCn:=0; 

END; 

FOR  I:=l  TO  NUMENT  DO  {FOR  EACH  ENTRY  IN  THE  OUTPUT  FILE) 

BEGIN 

K;=I; 


FOR  J:=l  TO  (NUrriLE)  DO  {FOR  EACH  OUTPUT  FILE} 


BEGIN 

TOTC I J : =TOTC I D+TEMPCKI; 

V«_UE} 

K:  =  (J*NUrENT)  +  I; 

COUNTER} 

END; 

mVGC  I  ] ;  =TOTi:  I D/NUTFILE; 
VALUES} 

END; 

GR^IT; 

END; 


{COMMUTE  A  TOTAL  VALUE  FOR  THE  ITH 
{INCREMENT  THE  ORIGINAL  FILE 

{COMPUTE  THE  AVERAGE  OF  ALL  ITH 

{GRAPH  THE  AV,ERAGE  VALUES} 


PROCEDURE  MDVAVG; 

{This  procedure  parallels  the  AVERAGE  procedure,  but  is  used  when  there 
is  only 

one  output  file  in  the  directory.  The  procedure  uses  the  Moving 
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Average 

algorithm  noted  by  Welch  in  the  "Computer  Performance  Modelling 
Handbook" , 

averaging  across  the  specified  "window"  to  smooth  the  data.  The  data 
generated  will  be  forwarded  to  the  GRAPHIT  procedure  for  display) 


VAR 

J,M,N  : INTEGER; 

TOT  :ARA; 

WINDOW  ; INTEGER;  WIDTH  OF  MOVING  WINDOW) 

BEGIN 

BOXES; 

TEXTCOLORC YELLOW); 

QOTOXYOO,6);  WRITE  (’MOVING  AVERAGE  ’); 

Q0T0XY(17,10);WRITE(’  NO.  OF  FILES  IN  DIRECTORY;  ’.NUMFILE); 
ODTOXYCl?, 12);WRITE(’  NO.  OF  ENTRIES  PER  FILE:  ’.NUhtNT); 

CQLERY  Tl-E  USER  FOR  THE  MOVING  AVERAGE  WINDOW  SIZE) 

QDT0XY(17, 14);WRITE(’l-k=LFWIDTH  OF  MOVAVG  WINDOW 
•”  ) ;  READLN  ( W I NDOW ) ; 

FOR  N:=l  TO  NUTtNT  DO 

T0TCN);=0;  <INITIA4,IZE  ALL  TOTALS  TO  ZERO) 

FOR  N;=l  TO  NUrtNT  DO  <RUN  THE  WINDOW  ACROSS  ALL  ENTITIES) 
BEGIN 

IF  N<(WINDOW+l)  THEN  (FOR  THOSE  BEFORE  Tl-E  WINDOW  WIDTH) 

BEGIN 

FOR  M:=l  TO  ((2*N)-1)  DO 
BEGIN 

TOTCN] ;  =TOTCN3-t-TEMPCM) ; 

END; 

MAVGCN] ; = ( 1 / ( <2*N)-1 ) )*TOTCN3 ; 

END; 

IF  N>=(WINDOW+l)  n-EN 
BEGIN 

(After  the  entry  is  large  enough  to  permit  the  full  window 
to  be  used  in  the  average  the  first  algorithm  below  is  used. 
This  algorithm  is  used  until  the  values  do  not  allow  for 
a  full  window  size,  due  to  the  end  of  the  file.  Then 
the  second  algorithm  below  is  used) 

IF  NK  (NUMENT -WINDOW)  THEN  (FIRST  ALGORITHM) 

BEGIN 

FOR  M:=(N-WINDOW)  TO  (NHJINDOW)  DO 
TOTCN) ; =T0TCN3+TEMPCM] ; 
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WVA3CN];  =(  l/(  (2H4INDCIW)  +  1)  )*T0T(:N3; 

EMD; 

IF  N>=(NLJ'tNTHJINDC)UI)  TVEN  (SECOND  PLGORITI-M} 

BEGIN 

J;  =  (NUt'ENT-N); 

FOR  M:=(N-J)  TO  NUMENT  DO 

totcn: : =totcn]+tem=cm: ; 

mVGCN]:=T0TCN]/((2*J)+l); 

END; 

END; 

END; 

GRAPHIT;  (GRAPH  T>C  MOVING  AVERAGES} 


END; 


PROCEDURE  NEWDATA; 


(This  procedure  is  called  from  the  DATA  MENU  to  develop  a 
new  directory  for  use  by  the  program.  It  builds  the  directory 
based  on  user  inputs  for  the  number  of  files  to  be  input  (NUhEILE) 
and  their  common  length  of  entries  (NUMENT).  Inputs  are  prompted 
by  the  program.  Upon  completion  the  user  selects  to  evaluate  the 
transient  for  either  single  or  multiple  output  files,  as  appropriate.) 


WPR  I,J,M 

FILrv^V^,D 

FIL,DCT 

00 

TEST,TESST 


INTEGER; 

WRI; 

FYLE; 

CmR; 

INTEGER; 


BEGIN 

BOXES; 

TEX TCOLORC YELLOW); 

GOTOXYOO,6);  WRITE  (’BUILD  IVEW  DIRECTORY  ’); 

Q0T0XY(17, 10);WRITE(’  OF  NEW  DIRECTORY: ’)  ;READLN(DCTmiE) ; 

DCTlvWrti^DCTTvWft+’.DCT’;  (T«<E  TVE  USERS  1N=UT  WD  ADD  .DCT) 

(At  this  point  the  user  is  queried  for  the  number  of  simulation 
run  outputs  within  the  file  and  their  length) 

Q0T0XY(17, 12);WRITE(’  NO.  OF  FILES  IN  DIRECTORY: 
’);READLN(NUmL£); 

QCrrOXY(17, 14);WRITE(’  NO.  OF  ENTRIES  PER  FILE;  ’ )  ;READLN(NUrtNT) ; 


ASSIGN(DCT,DCT7vW^);  (ASSIGN  A  STRING  DESIGTWTOR  TO  THE 
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P«RAY> 


REWRITE(DCT);  {OPEN  TFE  IvEW  DIRECTORY  FOR  USE} 

WRITELN(DCT,NLJMFILE);  {FIRST  ENTRY  IS  Tl-E  NUMBER  OF  OUTPUT 

FILES} 

WRITELN(DCT,NUrtNT);  {SECOND  ENTRY  IS  TFE  NO.  OF  ENTITIES  PER 

RUN} 


M:=l; 


TEST:=1; 

TESST:=4; 

WHILE  TEST=1  DO 
BEGIN 

CLRSCR; 

BOXES; 

TEST: =2; 

G0T0XY(30,6);WRITE(’  FILE  ENTRY'); 

{Now  the  simulation  output  file  to  be  read  from  is  specified} 
G0T0XY(17, 10);WRITE('NWrt  OF  FILE  :  ’ )  ;READLN(FlLN»!¥t) ; 
ASSIGN(FIL,FILM^);  {ASSIGN  THE  STRING  "FIL"  TO  THE  ORIGIIv*=L 

FILE} 


IF  (EXIST (FILNArt)=FA_SE)  TT-EN 
BEGIN 

G0T0XY(17, 12);WRITELN('FILE  DOES  NOT  EXIST’); 

Q0T0XY(17, 14);UJRITE(’ 1)  TRY  AQ^IN'); 

Q0T0XY(17, 15);WRITE(’2)  DATAIENU’); 

QaTQXY(17, 19);WRITE(’PLEASE  SELECT  NUMBER  OF  CHOICE:  '); 
Q0T0XY(49, 19); 

QO:=READKEY; 

CASE  GO  OF 

’1’  :  TEST:=1; 

’2’  ;  TESST:=3; 

END; 

END; 

END; 


IF  TESST  0  3  TVEN  {GIVEN  TWT  TIE  FILE  EXISTS} 

BEGIN 

RESET(FIL);  {OPEN  FIL  TO  BE 

READ} 

FOR  I:=  1  TO  NUMFILE  DO 
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CFGR  1>E  NUMBER 


BEGIN 

SPECIFIED} 

FOR  J:=l  TO  NUrtNT  DO 


BEGIN 

READLN(FIL,TEr^CM3);  {READ  THE  DATA  VALUE  FROM 

"FIL"} 

WRITELN(DCT,TEM=>CM3);  {URITE  IT  INTO  "DCT") 

M:=M+1; 

END; 

END; 

CLOSECFIL);  {CLOSE  Tl-E  FILES  TO  FURTHER 

OPERATION} 

CLOSE (DCT); 

BOXES; 

{The  user  is  then  asked  to  make  a  choice  from  the  menu  below.  If 
the  data  has  been  loaded  correctly,  he/she  can  proceed  to  the 
transient  identification  section.  Decision  between  the  first  two 
options  is  based  on  whether  there  are  multiple  simulation  output 
files  or  only  one.  If  the  data  was  not  loaded  correctly,  then  the 
usei-  can  return  to  the  data  menu  at  this  pt.} 

G0T0XY(17,12);  WRITE  (’DIRECTORY  ’,DCTmt,’  IS  NOW  BUILT.’); 
G0T0XY(17,13);  WRITE(’l)  IDENTIFY  TRrtvBIENT,  MULTIPLE  FILES’); 
Q0T0XY(17, 14);  WRITE(’2)  IDENTIFY  TRANSIENT,  SINGLE  FILE’); 
Q0T0XY(17, 15);  WRITE(’3)  DATA  fENU’); 

G0T0XY(17, 19);  WRITE (’PLEASE  ENTER  APPROPRIATE  CHOICE:  ’); 
QO:=READKEY; 

CASE  GO  OF 

’1’  :  AVERAGE;  {GIVEN  MULTIPLE  OUTPUTS  THIS  OPTION  IS 

CHOSEN} 

’2’  :  MOVAVG;  {fV^PROPRIATE  FOR  SINGLE  OUTPUT  FILE} 

END; 

END; 

END; 


PROCEDURE  REC«_L; 

{This  procedure  is  used  vpjhen  the  data  to  be  used  has  been  loaded 
previcxjsly. 

It  can  be  accessed  either  directly  from  the  main  Data  Menu  or  through 
the 

View  Procedure  listed  below.  In  either  case  this  procedure  queries  the 
user  for  the  .DCT  directory  to  be  loaded,  and  upon  valid  input  recovers 
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the  directory.  It  then  prints  the  number  of  files  and  their  length  on 
the 

screen  for  the  user's  information,  along  with  providing  three  options. 
These  options  are  identical  to  those  offered  in  procedure  Newdata 
above. 

They  are  to  Return  to  the  Data  Menu  or  identify  the  transient  for 
multiple 

or  single  files.} 

VPR  I, J,M,N,TEST,TESST 
FILNPrC 
FIL,DCT 
GO 

BEGIN 

TEXTCOLORCYELLOUI) ; 

NLJMFILE:=0;  (INITIALIZE  STATISTICS) 

NLJt'tMT:=0; 

TEST:=1; 

TESST:=4; 

WHILE  TEST=1  DO 
BEGIN 

CLRSCR; 

BOXES; 

TEST: =2; 

QDTOXYOO,6);  WRITE  (’RECALL  Dlf^CTORY 

GOTOXYCl?,  10);WRITE(’  DIRECTORY  TO  RECALL: ’)  ;READLN(DCnvA^) ; 
DCnvk^:=DCn\A^E+’ .DCT’ ;  (ADD  .DCT  EXTENSION  TO  FILE  TO  RECALL) 


WRI; 

FYLE; 

OAR: 


IF  (EXIST(DCT^A^E)=F(=LSE)  Tl-EN 
BEGIN 

GOTOXY(17, 12);WRITELN(’FILE  DOES  NOT  EXIST’); 

Q0T0XY(17, 14);WRITE(’ 1)  TRY  ACTIN’); 

Q0T0XY(17, 15);WRITE(’2)  DATAftNU’); 

Q0T0XY(17, 19);WRITE(’PLEASE  SELECT  NUMBER  OF  CHOICE:  ’); 
QDT0XY(49, 19); 

GO:=READKEY; 

CASE  GO  OF 

’1’  :  TEST;=1; 

’2’  :  TESST:=3; 

END; 


END; 

END; 

IF  TESST  0  3  TTEN 
BEGIN 
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ASSIGN(DCT,DCTNP^) ; 
RESET <DCT); 


{ASSIGN  STRING  FOR  FILE  OPERATION 
{OPEN  TPE  DIRECTORY  TO  BE  READ) 


READLNCDCT.NUMFILE);  {READ  Th€  NUMBER  OF  FILES  AND  LENGTH} 

READLN ( DCT , MOMENT ) ; 

M:=l; 

FOR  I:=l  TO  MJMFILE  DO 
BEGIN 

FOR  J:=l  TO  NUMENT  DO 
BEGIN 

READLN(DCT,TE^PCM:);  {READ  EACH  VALUE  INTO  TVE  ARRAY 

TEMPCI) 

M:=M+1; 

END; 

END; 


{Now  that  all  the  data  is  loaded  in,  the  program  provides  the 
user  with  the  number  of  files  and  their  length  to  verify  that 
it  IS  the  file  he/she  wanted.  If  it  is  then  the  transient  can 
be  identified.  If  it  is  not,  then  the  user  can  return  to  the 
Data  Menu  and  try  again.) 

G0T0XY(17, 10);WRITE(’  NO.  OFFICES  IN  DIRECTORY:  MJMFILE); 

G0T0XY<17, 11);WRITE('  NO.  OF  ENTRIES  PER  FILE:  ',NUftNT); 
G0T0XY(17, 13);  WRITEC'l)  IDENTIFY  TRf^IENT,  MULTIPLE  FILES’); 
G0TQXY(17,  14);  WRITE<’2)  IDENTIFY  TRWSIENT,  SINGLE  FILE’); 
GOTOXY(17, 15);  WRITE('3)  DATAftNU’); 

GOTOXYf 17, 19); URITEC’ PLEASE  SELECT  NUMBER  OF  CHOICE:  ’); 
Q0T0XY(49, 19); 

QO:=READKEY; 

CASE  GO  OF 

’1’  :  AVERAGE;  {(APPROPRIATE  FOR  MULTIPLE  OUTPUT  FILES) 

'2’  :  MOVAVG;  {APPROPRIATE  FOR  SINGLE  OUTPUT  FILE) 

END; 

END; 

END; 

PROCEDURE  VIEW; 

{This  procedure  is  called  when  the  user  asks  for  a  list  of  the 
existing  data  directories.  It  scans  the  current  computer  directory 
for  all  files  with  the  .DCT  extension  and  lists  them  on  the  screen. 
The  user  then  has  the  option  of  returning  to  the  niain  Data  Menu  or 
recalling  one  of  the  files  listed  through  the  Recall  procedure.) 

VAR 

DlRIlvFO:  SE(PCH?EC; 

CMD  ;  OWR; 

BEGIN 
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CLRSCR; 

TEXTCXa_aR(  YELLOW) ; 

QQT0XY(30,6);WRITELN(’LISTING  OF  DIRECTORIES'); 

WRITELN; WRITELN; WRITELN; 

FINDFIRST('*.DCT’ ,|:«CHIVE,DIRI^FO);  (LOCATES  FIRST  .DCT  FILE) 

WHILE  DOSERROR  =  0  DO  (UNTIL  THERE  NO  MORE 

FILES) 

BEGIN 

WRlTELNCDIRIlsFO.NPft);  (WRITE  TT-E  FILENP^E  TO 

SCREEN) 

FINDlvEXT(DIRItvFO);  (FIND  NEXT  .DCT  FILE) 

END; 

G0T0XY(17,20); 

WRITEC'TYPE  1  TO  LOAD  A  FILE,  2  FOR  DATA  MENU:  ’); 

CMD: =READKEY; 

CASE  CMD  OF 

'1':  RECALL;  (USED  TO  LOAD  OtvE  OF  THE  FILES  LISTED) 

END; 

END; 


PROCEDURE  DATA; 

(This  procedure  is  the  starting  point  of  the  program.  When  the  program 
.  is  initialized,  this  screen  is  displayed.  The  user  must  then  decide 
whether  he/she  wants  to  build  a  new  data  directory,  recall  one  whose 
name  is  known,  get  a  list  of  the  directories  which  currently  exist, 
or  exit  from  the  program.  Exiting  from  the  program  is  always  done 
from  the  Data  Menu.  At  the  completion  of  the  Batch  Means  and 
Independent 

Replications  routines  the  user  is  automatically  returned  to  the  Data 
Menu.  There  are  also  several  opportunities  within  the  process  to 
return 

to  this  menu  and  start  over.) 


00  :Cm?; 

BEGIN 
Q0:='3' ; 

WHILE  GO  <>  '0'  DO 
BEGIN 
BOXES; 

TEXTCOLOR(YELLOW) ; 

QDT0XY(30,6);  WRITE (’DATA  OPERATIONS  ’); 

Q0T0XY(17, 19);  WRITE( ’Enter  number  of  selection:  ’); 
GDT0XY(17,10);WRITE(’  1)  BUILD  NEW  DATA  DIRECTORY'); 

QDT0XY(17,11);WRITE('  2)  RECALL  EXISTING  DIRECTORY'); 

QDT0XY(17, 12);WRITE('  3)  LIST  EXISTING  DIRECTORIES'); 
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Q0T0XY(17, 15);WRITE(’  0)  EXIT’); 

gotoxy(44, 19) ; 

Q0:=READhXY; 

CASE  00  or 

'1':  NEWDATA;  {LEED  TO  BUILD  A  NEW  DIRECTORY) 

’2’:  RECALL;  <LEED  TO  REC^  A  DIRECTORY  WHOSE  Nk^^E  IS 

KNOWN} 

’3’:  VIEW;  {USED  TO  GET  A  LIST  OR  EXISTING 

DIRECTORIES} 

END; 

END; 

END; 

END. 
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Appendix  D 


Aux  i  1  i  ar  V  Turbo  Pasc  al  Progr  ams  Used 


PROGfWt  LININT; 

<THIS  PROQR^^  WILL  PROVIDE  A  LINEPrf?  If\rrERPa_ATIQN  OF  A  DATA  SET, 
PROVIDING  EQLm_Y  SPACED  VALUES  AT  INCRE^E^ITS  OF  10  TIME  UNITS) 

TYPE 

WRI  =  STRINGC20]; 

FYLE  =  TEXT; 


VPR 


Tir-E,NEW 

FYLE; 

I,J,K,L 

INTEGER 

TIM,MvEW 

WRI; 

OfvE,TWO,Y,mX 

RE/24.; 

A1,A2,B1,B2 

RE/^; 

LL 

WRI; 

{Declare  text  files  for  data  transfer) 
{Declare  names  to  be  associated  w/files) 

{Declare  coordinates  to  be  used) 


BEGIN 


FDR  L:=l  TO  50  DO  {For  the  50  SLAM  II  output  files) 

BEGIN 

STR(L,LL);  {Define  the  number  L  as  a  word) 


ASSIGNCTIME, 'Ma^E'+U.+' .OUT' );  {Identify  each  output  file  in 

order) 

ASSIGNCrsEW, 'EQUAL' +LL+' .DAT' );  {Build  the  Lth  new  file) 

RESET (TIME) ;  {Reset  the  output  file  to  be  read) 

REWRITE(NEW) ;  {Initialize  the  new  file  to  receive 

data) 


A2:=0.0; 

0,0) 

A1:=0.0; 

I:=10; 

K:  =  l; 


{Start  the  interpolation  at  the  point 


{Initialize  the  observation  step  at  10) 
{Initialize  counter  variable  to  1) 


READ(TIME,Bl);READLN(TirE,B2);  {Read  the  data  from  the  output 

file) 


WHILE  K<=1000  DO  {For  the  first  1000  data  values) 

BEGIN 

IF  (I>A1)  AND  <I<B1)  TVEN  {If  I  is  between  the  two 

observat ions) 

BEGIN 


lie 


Y;=(((B2-A2)/(B1-A1))«(I~A1))+A2;  {Compute  the  value  for  I 
on 

a  1  me  between  A  and  B> 

I:=I+10;  {Increment  I  to  I+IO) 

WRITELNCNEW, Y) ;  {Write  the  interpolated  value  to  new 

file} 

END; 

ir  I>B1  THEN  {If  1  doesn't  fall  between  A  and  Bl 

BEGIN 

A1:=B1;  {Reset  the  values  of  A  to  B) 

A2: =B2; 

READ(TIME,Bl);READLN(TIfE,B2);  {Read  the  data  from  output 

f  ile> 

K:=K+1;  {Increment  to  the  next  observation} 

END; 

END; 

CLOSE (NEW ); CLOSE (TIME) ;  {After  each  file  is  done,  close  the  files} 
END; 

END. 
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PROGRW  CONFILE; 

{This  program  concatenates  portions  of  sequential  data  files.  Its 
purpose  is  to  form  data  files  of  2000  values  from  the  equally 
spaced  data  files  generated  by  LININT.PAS,  for  use  by  the 
software  package  FORECAST  MASTER > 

TYPE 

WRI  =  STRINGC20:; 

FYLE  =  TEXT; 

VAR 

Tift,  NEW  :FYLE; 

I,J  :  INTEGER; 

TIM,MvEW  :WRI; 

ONE, TWO  :REAL; 

FILENAIt  ;WRI; 

rvEWtvk=rt,  JJ  :WRI; 

BEGIN 

FOR  J:=l  TO  50  DO 
BEGIN 

STRCJ, JJ); 
than  #> 

FILENAME: =<’FCM’+JJ+' .DAT’ );  {Match  identifiers  to  the  Jth  file) 
fNEWNAft:  =  <’K50.DAT’ ) ;  {Generate  the  name  of  output  file) 

ASSIGNCTirt,  FILENAME) ; 

WRITELNCFILENAft ) ;  {Writes  filename  to  screen, as  a 

check) 

ASSIGN(NEW, NEWNAME) ; 

RESET  (Tilt) ;  {Resets  the  output  file  to  be  read) 

APPEND (^EW);  {Prepares  the  file  to  be  appended) 

FOR  I:=l  TO  460  DO 
BEGIN 

READLN (TIME, TWO) ;  {Simply  read  the  first  460  values) 

END; 

FOR  I:=l  TO  40  DO 
BEGIN 

READLN(TIIt,TWO) ;  {Place  the  last  40  values  from  the  50  files) 
WRITELN<NEW, TWO);  {into  the  newfile,  for  a  total  of  2000  obs. ) 
END; 

aJDSE(NEW);CLOSE(TIft);  {Close  the  files  at  end  of  each 
iteration) 

END; 

END. 


{For  the  50  SLAM  II  output  files) 
{Makes  J  a  string  (word)  rather 


{Declare  file  names  for  input/output ) 
{Declare  counter  variables) 

{Declare  identifiers  for  filenames) 
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Appendix  E:  Array  of  T i me- i n-Svst em  to  T ime-Between-Qbservat ions 

T 

1  Tlas-BatwMA-OtttarvAtlons 

L  1  ^  •  a  4  ?  a  •  t6  11  i2  >a  i*  ta  i4  ty  la  ao  ai  m  aa  24  aa 

k  203  ••  03  SO  33  44  21  30  30  23  ii  10  18  t3  3  3  3  1  7  3  3  4  1  2  3 

2  33  32  30  34  23  11  23  12  3  11  12  3  4  4  1  1  i  2  1  1  3  2  1  2  4 

3  33  44  27  22  17  13  12  1  11  3  11  1  2  3  3  I  3  4  1  0  1  0  2  1  3 

4  70  33  23  11  22  23  12  13  14  3  12  3  2  3  3  3  2  3  1  3  2  0  2  1  4 

8  32  23  23  21  23  3  13  10  10  3  8  4  3  3  3  1  I  10  1  10  1  18 

3  70  37  21  13  3  11  13  13  3  7  3  3  1  3  2  1  7  1  0  0  2  1  0  1  3 

7  34  21  13  11  13  18  10  11  10  7  32  7  2323  1  1  2  1  0004 

3  33  21  13  20  13  1  11  832  3  1  4  1  320  1  3  0  1  00  1  4 

I  37  21  13  20  7  13  11  733323  13000000  1  002 

10  41  13  14  17  13  II  77334331  1  101  1  102101 

II  31  13  10  11  13  10  3844423200  1  230  1  2203 

12  37  17  14  11  37  14  310134  10002  I  020003 

13  43  7  10  10  7  13  3473424201  130001003 

14  33  12  13  10  13  73334434  1  1  43  1  200  1  000 

13  42  13  1  10  31337701  33001  1  t  101000 

13  11  12  3  10  4433332330310001001  12 

17  30  33477323330  1  0203200  1  000  1 

13  22  13  10  332443331  100201  00001  1 

11  23  34323431420401010101  1003 

20  20  733323214131221000100010 

21  13  33134  134  1220001000100001 

22  13  3331331  1  2200010010000002 

23  13  43  12  134021003300100000010 

24  1  10  723231  100220201  lOIOOOOt 

23  13  2782131001  12102010000002 

23  34  332231212031120100000000 

27  21  331401312010112010000100 

23  14341241012101011300010000 
23  3424341201203001100000001 

30  123333224301001  10000100000 

31  3143121201100101100000000 

32  38240001  1  1013201000000000 

33  33401  1001  lOOOeOOOOOOOOOOO 

34  10  344310200221  1  lOlOOOl  1000 

33  4331312102111130000000000 

33  0314303002000000000000000 

27  301010001  1000000000000000 

33  4143033013300200000000000 

23  2131  100000100010000000000 

40  3310111100100101000000000 

41  3001020101010000000000000 

43  3001001  loioooooooooooeooo 

43  4210323000100100010000000 

44  3101010000100900000100000 

43  0310000000000000000001001 

43  1  103102100100000010000000 

47  3100001  100000000100000000 

43  21000001  10102100000000000 

49  2000100000130000000000001 

30  29  13  33  13  33323130010110100002 
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